comparison lispref/strings.texi @ 52947:3c1778936dff

(Creating Strings): Argument START to `substring' cannot be `nil'. Expand description of `substring-no-properties'. Correct description of `split-string', especially with respect to empty matches. Prevent very bad line break in definition of `split-string-default-separators'. (Text Comparison): `string=' and `string<' also accept symbols as arguments. (String Conversion): More completely describe argument BASE in `string-to-number'. (Formatting Strings): `%s' and `%S' in `format' do require corresponding object. Clarify behavior of numeric prefix after `%' in `format'. (Case Conversion): The argument to `upcase-initials' can be a character.
author Luc Teirlinck <teirllm@auburn.edu>
date Mon, 27 Oct 2003 15:54:13 +0000
parents ead8baf4d882
children 1a5c50faf357
52946:ad0680ce76f5 52947:3c1778936dff
170 @noindent 170 @noindent
171 In this example, the index for @samp{e} is @minus{}3, the index for 171 In this example, the index for @samp{e} is @minus{}3, the index for
172 @samp{f} is @minus{}2, and the index for @samp{g} is @minus{}1. 172 @samp{f} is @minus{}2, and the index for @samp{g} is @minus{}1.
173 Therefore, @samp{e} and @samp{f} are included, and @samp{g} is excluded. 173 Therefore, @samp{e} and @samp{f} are included, and @samp{g} is excluded.
174 174
175 When @code{nil} is used as an index, it stands for the length of the 175 When @code{nil} is used for @var{end}, it stands for the length of the
176 string. Thus, 176 string. Thus,
177 177
178 @example 178 @example
179 @group 179 @group
180 (substring "abcdefg" -3 nil) 180 (substring "abcdefg" -3 nil)
206 @example 206 @example
207 (substring [a b (c) "d"] 1 3) 207 (substring [a b (c) "d"] 1 3)
208 @result{} [b (c)] 208 @result{} [b (c)]
209 @end example 209 @end example
210 210
211 A @code{wrong-type-argument} error is signaled if either @var{start} or 211 A @code{wrong-type-argument} error is signaled if @var{start} is not
212 @var{end} is not an integer or @code{nil}. An @code{args-out-of-range} 212 an integer or if @var{end} is neither an integer nor @code{nil}. An
213 error is signaled if @var{start} indicates a character following 213 @code{args-out-of-range} error is signaled if @var{start} indicates a
214 @var{end}, or if either integer is out of range for @var{string}. 214 character following @var{end}, or if either integer is out of range
215 for @var{string}.
215 216
216 Contrast this function with @code{buffer-substring} (@pxref{Buffer 217 Contrast this function with @code{buffer-substring} (@pxref{Buffer
217 Contents}), which returns a string containing a portion of the text in 218 Contents}), which returns a string containing a portion of the text in
218 the current buffer. The beginning of a string is at index 0, but the 219 the current buffer. The beginning of a string is at index 0, but the
219 beginning of a buffer is at index 1. 220 beginning of a buffer is at index 1.
220 @end defun 221 @end defun
221 222
222 @defun substring-no-properties string start &optional end 223 @defun substring-no-properties string &optional start end
223 This works like @code{substring} but discards all text properties 224 This works like @code{substring} but discards all text properties from
224 from the value. 225 the value. Also, @var{start} may be omitted or @code{nil}, which is
226 equivalent to 0. Thus, @w{@code{(substring-no-properties
227 @var{string})}} returns a copy of @var{string}, with all text
228 properties removed.
225 @end defun 229 @end defun
226 230
227 @defun concat &rest sequences 231 @defun concat &rest sequences
228 @cindex copying strings 232 @cindex copying strings
229 @cindex concatenating strings 233 @cindex concatenating strings
262 description of @code{mapconcat} in @ref{Mapping Functions}, 266 description of @code{mapconcat} in @ref{Mapping Functions},
263 @code{vconcat} in @ref{Vector Functions}, and @code{append} in @ref{Building 267 @code{vconcat} in @ref{Vector Functions}, and @code{append} in @ref{Building
264 Lists}. 268 Lists}.
265 @end defun 269 @end defun
266 270
267 @defun split-string string separators omit-nulls 271 @defun split-string string &optional separators omit-nulls
268 This function splits @var{string} into substrings at matches for the 272 This function splits @var{string} into substrings at matches for the
269 regular expression @var{separators}. Each match for @var{separators} 273 regular expression @var{separators}. Each match for @var{separators}
270 defines a splitting point; the substrings between the splitting points 274 defines a splitting point; the substrings between the splitting points
271 are made into a list, which is the value returned by 275 are made into a list, which is the value returned by
272 @code{split-string}. 276 @code{split-string}.
283 As a special case, when @var{separators} is @code{nil} (or omitted), 287 As a special case, when @var{separators} is @code{nil} (or omitted),
284 null strings are always omitted from the result. Thus: 288 null strings are always omitted from the result. Thus:
285 289
286 @example 290 @example
287 (split-string " two words ") 291 (split-string " two words ")
288 @result{} ("two" "words") 292 @result{} ("two" "words")
289 @end example 293 @end example
290 294
291 The result is not @samp{("" "two" "words" "")}, which would rarely be 295 The result is not @samp{("" "two" "words" "")}, which would rarely be
292 useful. If you need such a result, use an explicit value for 296 useful. If you need such a result, use an explicit value for
293 @var{separators}: 297 @var{separators}:
294 298
295 @example 299 @example
296 (split-string " two words " split-string-default-separators) 300 (split-string " two words " split-string-default-separators)
297 @result{} ("" "two" "words" "") 301 @result{} ("" "two" "words" "")
298 @end example 302 @end example
299 303
300 More examples: 304 More examples:
301 305
302 @example 306 @example
303 (split-string "Soup is good food" "o") 307 (split-string "Soup is good food" "o")
304 @result{} ("S" "up is g" "" "d f" "" "d") 308 @result{} ("S" "up is g" "" "d f" "" "d")
305 (split-string "Soup is good food" "o" t) 309 (split-string "Soup is good food" "o" t)
306 @result{} ("S" "up is g" "d f" "d") 310 @result{} ("S" "up is g" "d f" "d")
307 (split-string "Soup is good food" "o+") 311 (split-string "Soup is good food" "o+")
308 @result{} ("S" "up is g" "d f" "d") 312 @result{} ("S" "up is g" "d f" "d")
309 @end example 313 @end example
310 314
311 Empty matches do count, when not adjacent to another match: 315 Empty matches do count, except that @code{split-string} will not look
312 316 for a final empty match when it already reached the end of the string
313 @example 317 using a non-empty match or when @var{string} is empty:
314 (split-string "Soup is good food" "o*") 318
315 @result{}("S" "u" "p" " " "i" "s" " " "g" "d" " " "f" "d") 319 @example
316 (split-string "Nice doggy!" "") 320 (split-string "aooob" "o*")
317 @result{}("N" "i" "c" "e" " " "d" "o" "g" "g" "y" "!") 321 @result{} ("" "a" "" "b" "")
322 (split-string "ooaboo" "o*")
323 @result{} ("" "" "a" "b" "")
324 (split-string "" "")
325 @result{} ("")
326 @end example
327
328 However, when @var{separators} can match the empty string,
329 @var{omit-nulls} is usually @code{t}, so that the subtleties in the
330 three previous examples are rarely relevant:
331
332 @example
333 (split-string "Soup is good food" "o*" t)
334 @result{} ("S" "u" "p" " " "i" "s" " " "g" "d" " " "f" "d")
335 (split-string "Nice doggy!" "" t)
336 @result{} ("N" "i" "c" "e" " " "d" "o" "g" "g" "y" "!")
337 (split-string "" "" t)
338 @result{} nil
339 @end example
340
341 Somewhat odd, but predictable, behavior can occur for certain
342 ``non-greedy'' values of @var{separators} that can prefer empty
343 matches over non-empty matches. Again, such values rarely occur in
344 practice:
345
346 @example
347 (split-string "ooo" "o*" t)
348 @result{} nil
349 (split-string "ooo" "\\|o+" t)
350 @result{} ("o" "o" "o")
318 @end example 351 @end example
319 @end defun 352 @end defun
320 353
321 @defvar split-string-default-separators 354 @defvar split-string-default-separators
322 The default value of @var{separators} for @code{split-string}, initially 355 The default value of @var{separators} for @code{split-string}, initially
323 @samp{"[ \f\t\n\r\v]+"}. 356 @w{@samp{"[ \f\t\n\r\v]+"}}.
324 @end defvar 357 @end defvar
325 358
326 @node Modifying Strings 359 @node Modifying Strings
327 @section Modifying Strings 360 @section Modifying Strings
328 361
365 @end example 398 @end example
366 @end defun 399 @end defun
367 400
368 @defun string= string1 string2 401 @defun string= string1 string2
369 This function returns @code{t} if the characters of the two strings 402 This function returns @code{t} if the characters of the two strings
370 match exactly. 403 match exactly. Symbols are also allowed as arguments, in which case
404 their print names are used.
371 Case is always significant, regardless of @code{case-fold-search}. 405 Case is always significant, regardless of @code{case-fold-search}.
372 406
373 @example 407 @example
374 (string= "abc" "abc") 408 (string= "abc" "abc")
375 @result{} t 409 @result{} t
439 @result{} nil 473 @result{} nil
440 (string< "" "") 474 (string< "" "")
441 @result{} nil 475 @result{} nil
442 @end group 476 @end group
443 @end example 477 @end example
478
479 Symbols are also allowed as arguments, in which case their print names
480 are used.
444 @end defun 481 @end defun
445 482
446 @defun string-lessp string1 string2 483 @defun string-lessp string1 string2
447 @code{string-lessp} is another name for @code{string<}. 484 @code{string-lessp} is another name for @code{string<}.
448 @end defun 485 @end defun
543 negative. 580 negative.
544 581
545 @example 582 @example
546 (number-to-string 256) 583 (number-to-string 256)
547 @result{} "256" 584 @result{} "256"
585 @group
548 (number-to-string -23) 586 (number-to-string -23)
549 @result{} "-23" 587 @result{} "-23"
588 @end group
550 (number-to-string -23.5) 589 (number-to-string -23.5)
551 @result{} "-23.5" 590 @result{} "-23.5"
552 @end example 591 @end example
553 592
554 @cindex int-to-string 593 @cindex int-to-string
558 @end defun 597 @end defun
559 598
560 @defun string-to-number string &optional base 599 @defun string-to-number string &optional base
561 @cindex string to number 600 @cindex string to number
562 This function returns the numeric value of the characters in 601 This function returns the numeric value of the characters in
563 @var{string}. If @var{base} is non-@code{nil}, integers are converted 602 @var{string}. If @var{base} is non-@code{nil}, it must be an integer
564 in that base. If @var{base} is @code{nil}, then base ten is used. 603 between 2 and 16 (inclusive), and integers are converted in that base.
565 Floating point conversion always uses base ten; we have not implemented 604 If @var{base} is @code{nil}, then base ten is used. Floating point
566 other radices for floating point numbers, because that would be much 605 conversion only works in base ten; we have not implemented other
567 more work and does not seem useful. If @var{string} looks like an 606 radices for floating point numbers, because that would be much more
568 integer but its value is too large to fit into a Lisp integer, 607 work and does not seem useful. If @var{string} looks like an integer
608 but its value is too large to fit into a Lisp integer,
569 @code{string-to-number} returns a floating point result. 609 @code{string-to-number} returns a floating point result.
570 610
571 The parsing skips spaces and tabs at the beginning of @var{string}, then 611 The parsing skips spaces and tabs at the beginning of @var{string},
572 reads as much of @var{string} as it can interpret as a number. (On some 612 then reads as much of @var{string} as it can interpret as a number in
573 systems it ignores other whitespace at the beginning, not just spaces 613 the given base. (On some systems it ignores other whitespace at the
574 and tabs.) If the first character after the ignored whitespace is 614 beginning, not just spaces and tabs.) If the first character after
575 neither a digit, nor a plus or minus sign, nor the leading dot of a 615 the ignored whitespace is neither a digit in the given base, nor a
576 floating point number, this function returns 0. 616 plus or minus sign, nor the leading dot of a floating point number,
617 this function returns 0.
577 618
578 @example 619 @example
579 (string-to-number "256") 620 (string-to-number "256")
580 @result{} 256 621 @result{} 256
581 (string-to-number "25 is a perfect square.") 622 (string-to-number "25 is a perfect square.")
673 714
674 Starting in Emacs 21, if the object is a string, its text properties are 715 Starting in Emacs 21, if the object is a string, its text properties are
675 copied into the output. The text properties of the @samp{%s} itself 716 copied into the output. The text properties of the @samp{%s} itself
676 are also copied, but those of the object take priority. 717 are also copied, but those of the object take priority.
677 718
678 If there is no corresponding object, the empty string is used.
679
680 @item %S 719 @item %S
681 Replace the specification with the printed representation of the object, 720 Replace the specification with the printed representation of the object,
682 made with quoting (that is, using @code{prin1}---@pxref{Output 721 made with quoting (that is, using @code{prin1}---@pxref{Output
683 Functions}). Thus, strings are enclosed in @samp{"} characters, and 722 Functions}). Thus, strings are enclosed in @samp{"} characters, and
684 @samp{\} characters appear where necessary before special characters. 723 @samp{\} characters appear where necessary before special characters.
685 724
686 If there is no corresponding object, the empty string is used.
687
688 @item %o 725 @item %o
689 @cindex integer to octal 726 @cindex integer to octal
690 Replace the specification with the base-eight representation of an 727 Replace the specification with the base-eight representation of an
691 integer. 728 integer.
692 729
745 @cindex numeric prefix 782 @cindex numeric prefix
746 @cindex field width 783 @cindex field width
747 @cindex padding 784 @cindex padding
748 All the specification characters allow an optional numeric prefix 785 All the specification characters allow an optional numeric prefix
749 between the @samp{%} and the character. The optional numeric prefix 786 between the @samp{%} and the character. The optional numeric prefix
750 defines the minimum width for the object. If the printed representation 787 defines the minimum width for the object. If the printed
751 of the object contains fewer characters than this, then it is padded. 788 representation of the object contains fewer characters than this, then
752 The padding is on the left if the prefix is positive (or starts with 789 it is padded. The padding is on the left if the prefix is positive
753 zero) and on the right if the prefix is negative. The padding character 790 (or starts with zero) and on the right if the prefix is negative. The
754 is normally a space, but if the numeric prefix starts with a zero, zeros 791 padding character is normally a space, but if the numeric prefix
755 are used for padding. Here are some examples of padding: 792 starts with a zero, zeros are used for padding. Some of these
793 conventions are ignored for specification characters for which they do
794 not make sense. That is, %s, %S and %c accept a numeric prefix
795 starting with 0, but still pad with @emph{spaces} on the left. Also,
796 %% accepts a numeric prefix, but ignores it. Here are some examples
797 of padding:
756 798
757 @example 799 @example
758 (format "%06d is padded on the left with zeros" 123) 800 (format "%06d is padded on the left with zeros" 123)
759 @result{} "000123 is padded on the left with zeros" 801 @result{} "000123 is padded on the left with zeros"
760 802
870 912
871 When the argument to @code{capitalize} is a character, @code{capitalize} 913 When the argument to @code{capitalize} is a character, @code{capitalize}
872 has the same result as @code{upcase}. 914 has the same result as @code{upcase}.
873 915
874 @example 916 @example
917 @group
875 (capitalize "The cat in the hat") 918 (capitalize "The cat in the hat")
876 @result{} "The Cat In The Hat" 919 @result{} "The Cat In The Hat"
877 920 @end group
921
922 @group
878 (capitalize "THE 77TH-HATTED CAT") 923 (capitalize "THE 77TH-HATTED CAT")
879 @result{} "The 77th-Hatted Cat" 924 @result{} "The 77th-Hatted Cat"
925 @end group
880 926
881 @group 927 @group
882 (capitalize ?x) 928 (capitalize ?x)
883 @result{} 88 929 @result{} 88
884 @end group 930 @end group
885 @end example 931 @end example
886 @end defun 932 @end defun
887 933
888 @defun upcase-initials string 934 @defun upcase-initials string-or-char
889 This function capitalizes the initials of the words in @var{string}, 935 If @var{string-or-char} is a string, this function capitalizes the
890 without altering any letters other than the initials. It returns a new 936 initials of the words in @var{string-or-char}, without altering any
891 string whose contents are a copy of @var{string}, in which each word has 937 letters other than the initials. It returns a new string whose
938 contents are a copy of @var{string-or-char}, in which each word has
892 had its initial letter converted to upper case. 939 had its initial letter converted to upper case.
893 940
894 The definition of a word is any sequence of consecutive characters that 941 The definition of a word is any sequence of consecutive characters that
895 are assigned to the word constituent syntax class in the current syntax 942 are assigned to the word constituent syntax class in the current syntax
896 table (@pxref{Syntax Class Table}). 943 table (@pxref{Syntax Class Table}).
944
945 When the argument to @code{upcase-initials} is a character,
946 @code{upcase-initials} has the same result as @code{upcase}.
897 947
898 @example 948 @example
899 @group 949 @group
900 (upcase-initials "The CAT in the hAt") 950 (upcase-initials "The CAT in the hAt")
901 @result{} "The CAT In The HAt" 951 @result{} "The CAT In The HAt"