comparison lispref/strings.texi @ 52947:3c1778936dff
(Creating Strings): Argument START to `substring' can not be `nil'.
Expand description of `substring-no-properties'. Correct description
of `split-string', especially with respect to empty matches. Prevent
very bad line break in definition of `split-string-default-separators'.
(Text Comparison): `string=' and `string<' also accept symbols as
arguments.
(String Conversion): More completely describe argument BASE in
`string-to-number'.
(Formatting Strings): `%s' and `%S' in `format' do require corresponding
object. Clarify behavior of numeric prefix after `%' in `format'.
(Case Conversion): The argument to `upcase-initials' can be a character.
author | Luc Teirlinck <teirllm@auburn.edu> |
---|---|
date | Mon, 27 Oct 2003 15:54:13 +0000 |
parents | ead8baf4d882 |
children | 1a5c50faf357 |
52946:ad0680ce76f5 | 52947:3c1778936dff |
---|---|
170 @noindent | 170 @noindent |
171 In this example, the index for @samp{e} is @minus{}3, the index for | 171 In this example, the index for @samp{e} is @minus{}3, the index for |
172 @samp{f} is @minus{}2, and the index for @samp{g} is @minus{}1. | 172 @samp{f} is @minus{}2, and the index for @samp{g} is @minus{}1. |
173 Therefore, @samp{e} and @samp{f} are included, and @samp{g} is excluded. | 173 Therefore, @samp{e} and @samp{f} are included, and @samp{g} is excluded. |
174 | 174 |
175 When @code{nil} is used as an index, it stands for the length of the | 175 When @code{nil} is used for @var{end}, it stands for the length of the |
176 string. Thus, | 176 string. Thus, |
177 | 177 |
178 @example | 178 @example |
179 @group | 179 @group |
180 (substring "abcdefg" -3 nil) | 180 (substring "abcdefg" -3 nil) |
206 @example | 206 @example |
207 (substring [a b (c) "d"] 1 3) | 207 (substring [a b (c) "d"] 1 3) |
208 @result{} [b (c)] | 208 @result{} [b (c)] |
209 @end example | 209 @end example |
210 | 210 |
211 A @code{wrong-type-argument} error is signaled if either @var{start} or | 211 A @code{wrong-type-argument} error is signaled if @var{start} is not |
212 @var{end} is not an integer or @code{nil}. An @code{args-out-of-range} | 212 an integer or if @var{end} is neither an integer nor @code{nil}. An |
213 error is signaled if @var{start} indicates a character following | 213 @code{args-out-of-range} error is signaled if @var{start} indicates a |
214 @var{end}, or if either integer is out of range for @var{string}. | 214 character following @var{end}, or if either integer is out of range |
215 for @var{string}. | |
215 | 216 |
216 Contrast this function with @code{buffer-substring} (@pxref{Buffer | 217 Contrast this function with @code{buffer-substring} (@pxref{Buffer |
217 Contents}), which returns a string containing a portion of the text in | 218 Contents}), which returns a string containing a portion of the text in |
218 the current buffer. The beginning of a string is at index 0, but the | 219 the current buffer. The beginning of a string is at index 0, but the |
219 beginning of a buffer is at index 1. | 220 beginning of a buffer is at index 1. |
220 @end defun | 221 @end defun |
221 | 222 |
222 @defun substring-no-properties string start &optional end | 223 @defun substring-no-properties string &optional start end |
223 This works like @code{substring} but discards all text properties | 224 This works like @code{substring} but discards all text properties from |
224 from the value. | 225 the value. Also, @var{start} may be omitted or @code{nil}, which is |
226 equivalent to 0. Thus, @w{@code{(substring-no-properties | |
227 @var{string})}} returns a copy of @var{string}, with all text | |
228 properties removed. | |
225 @end defun | 229 @end defun |
226 | 230 |
227 @defun concat &rest sequences | 231 @defun concat &rest sequences |
228 @cindex copying strings | 232 @cindex copying strings |
229 @cindex concatenating strings | 233 @cindex concatenating strings |
262 description of @code{mapconcat} in @ref{Mapping Functions}, | 266 description of @code{mapconcat} in @ref{Mapping Functions}, |
263 @code{vconcat} in @ref{Vector Functions}, and @code{append} in @ref{Building | 267 @code{vconcat} in @ref{Vector Functions}, and @code{append} in @ref{Building |
264 Lists}. | 268 Lists}. |
265 @end defun | 269 @end defun |
266 | 270 |
267 @defun split-string string separators omit-nulls | 271 @defun split-string string &optional separators omit-nulls |
268 This function splits @var{string} into substrings at matches for the | 272 This function splits @var{string} into substrings at matches for the |
269 regular expression @var{separators}. Each match for @var{separators} | 273 regular expression @var{separators}. Each match for @var{separators} |
270 defines a splitting point; the substrings between the splitting points | 274 defines a splitting point; the substrings between the splitting points |
271 are made into a list, which is the value returned by | 275 are made into a list, which is the value returned by |
272 @code{split-string}. | 276 @code{split-string}. |
283 As a special case, when @var{separators} is @code{nil} (or omitted), | 287 As a special case, when @var{separators} is @code{nil} (or omitted), |
284 null strings are always omitted from the result. Thus: | 288 null strings are always omitted from the result. Thus: |
285 | 289 |
286 @example | 290 @example |
287 (split-string " two words ") | 291 (split-string " two words ") |
288 @result{} ("two" "words") | 292 @result{} ("two" "words") |
289 @end example | 293 @end example |
290 | 294 |
291 The result is not @samp{("" "two" "words" "")}, which would rarely be | 295 The result is not @samp{("" "two" "words" "")}, which would rarely be |
292 useful. If you need such a result, use an explicit value for | 296 useful. If you need such a result, use an explicit value for |
293 @var{separators}: | 297 @var{separators}: |
294 | 298 |
295 @example | 299 @example |
296 (split-string " two words " split-string-default-separators) | 300 (split-string " two words " split-string-default-separators) |
297 @result{} ("" "two" "words" "") | 301 @result{} ("" "two" "words" "") |
298 @end example | 302 @end example |
299 | 303 |
300 More examples: | 304 More examples: |
301 | 305 |
302 @example | 306 @example |
303 (split-string "Soup is good food" "o") | 307 (split-string "Soup is good food" "o") |
304 @result{} ("S" "up is g" "" "d f" "" "d") | 308 @result{} ("S" "up is g" "" "d f" "" "d") |
305 (split-string "Soup is good food" "o" t) | 309 (split-string "Soup is good food" "o" t) |
306 @result{} ("S" "up is g" "d f" "d") | 310 @result{} ("S" "up is g" "d f" "d") |
307 (split-string "Soup is good food" "o+") | 311 (split-string "Soup is good food" "o+") |
308 @result{} ("S" "up is g" "d f" "d") | 312 @result{} ("S" "up is g" "d f" "d") |
309 @end example | 313 @end example |
310 | 314 |
311 Empty matches do count, when not adjacent to another match: | 315 Empty matches do count, except that @code{split-string} will not look |
312 | 316 for a final empty match when it already reached the end of the string |
313 @example | 317 using a non-empty match or when @var{string} is empty: |
314 (split-string "Soup is good food" "o*") | 318 |
315 @result{}("S" "u" "p" " " "i" "s" " " "g" "d" " " "f" "d") | 319 @example |
316 (split-string "Nice doggy!" "") | 320 (split-string "aooob" "o*") |
317 @result{}("N" "i" "c" "e" " " "d" "o" "g" "g" "y" "!") | 321 @result{} ("" "a" "" "b" "") |
322 (split-string "ooaboo" "o*") | |
323 @result{} ("" "" "a" "b" "") | |
324 (split-string "" "") | |
325 @result{} ("") | |
326 @end example | |
327 | |
328 However, when @var{separators} can match the empty string, | |
329 @var{omit-nulls} is usually @code{t}, so that the subtleties in the | |
330 three previous examples are rarely relevant: | |
331 | |
332 @example | |
333 (split-string "Soup is good food" "o*" t) | |
334 @result{} ("S" "u" "p" " " "i" "s" " " "g" "d" " " "f" "d") | |
335 (split-string "Nice doggy!" "" t) | |
336 @result{} ("N" "i" "c" "e" " " "d" "o" "g" "g" "y" "!") | |
337 (split-string "" "" t) | |
338 @result{} nil | |
339 @end example | |
340 | |
341 Somewhat odd, but predictable, behavior can occur for certain | |
342 ``non-greedy'' values of @var{separators} that can prefer empty | |
343 matches over non-empty matches. Again, such values rarely occur in | |
344 practice: | |
345 | |
346 @example | |
347 (split-string "ooo" "o*" t) | |
348 @result{} nil | |
349 (split-string "ooo" "\\|o+" t) | |
350 @result{} ("o" "o" "o") | |
318 @end example | 351 @end example |
319 @end defun | 352 @end defun |
320 | 353 |
321 @defvar split-string-default-separators | 354 @defvar split-string-default-separators |
322 The default value of @var{separators} for @code{split-string}, initially | 355 The default value of @var{separators} for @code{split-string}, initially |
323 @samp{"[ \f\t\n\r\v]+"}. | 356 @w{@samp{"[ \f\t\n\r\v]+"}}. |
324 @end defvar | 357 @end defvar |
325 | 358 |
326 @node Modifying Strings | 359 @node Modifying Strings |
327 @section Modifying Strings | 360 @section Modifying Strings |
328 | 361 |
365 @end example | 398 @end example |
366 @end defun | 399 @end defun |
367 | 400 |
368 @defun string= string1 string2 | 401 @defun string= string1 string2 |
369 This function returns @code{t} if the characters of the two strings | 402 This function returns @code{t} if the characters of the two strings |
370 match exactly. | 403 match exactly. Symbols are also allowed as arguments, in which case |
404 their print names are used. | |
371 Case is always significant, regardless of @code{case-fold-search}. | 405 Case is always significant, regardless of @code{case-fold-search}. |
372 | 406 |
373 @example | 407 @example |
374 (string= "abc" "abc") | 408 (string= "abc" "abc") |
375 @result{} t | 409 @result{} t |
439 @result{} nil | 473 @result{} nil |
440 (string< "" "") | 474 (string< "" "") |
441 @result{} nil | 475 @result{} nil |
442 @end group | 476 @end group |
443 @end example | 477 @end example |
478 | |
479 Symbols are also allowed as arguments, in which case their print names | |
480 are used. | |
444 @end defun | 481 @end defun |
445 | 482 |
446 @defun string-lessp string1 string2 | 483 @defun string-lessp string1 string2 |
447 @code{string-lessp} is another name for @code{string<}. | 484 @code{string-lessp} is another name for @code{string<}. |
448 @end defun | 485 @end defun |
543 negative. | 580 negative. |
544 | 581 |
545 @example | 582 @example |
546 (number-to-string 256) | 583 (number-to-string 256) |
547 @result{} "256" | 584 @result{} "256" |
585 @group | |
548 (number-to-string -23) | 586 (number-to-string -23) |
549 @result{} "-23" | 587 @result{} "-23" |
588 @end group | |
550 (number-to-string -23.5) | 589 (number-to-string -23.5) |
551 @result{} "-23.5" | 590 @result{} "-23.5" |
552 @end example | 591 @end example |
553 | 592 |
554 @cindex int-to-string | 593 @cindex int-to-string |
558 @end defun | 597 @end defun |
559 | 598 |
560 @defun string-to-number string &optional base | 599 @defun string-to-number string &optional base |
561 @cindex string to number | 600 @cindex string to number |
562 This function returns the numeric value of the characters in | 601 This function returns the numeric value of the characters in |
563 @var{string}. If @var{base} is non-@code{nil}, integers are converted | 602 @var{string}. If @var{base} is non-@code{nil}, it must be an integer |
564 in that base. If @var{base} is @code{nil}, then base ten is used. | 603 between 2 and 16 (inclusive), and integers are converted in that base. |
565 Floating point conversion always uses base ten; we have not implemented | 604 If @var{base} is @code{nil}, then base ten is used. Floating point |
566 other radices for floating point numbers, because that would be much | 605 conversion only works in base ten; we have not implemented other |
567 more work and does not seem useful. If @var{string} looks like an | 606 radices for floating point numbers, because that would be much more |
568 integer but its value is too large to fit into a Lisp integer, | 607 work and does not seem useful. If @var{string} looks like an integer |
608 but its value is too large to fit into a Lisp integer, | |
569 @code{string-to-number} returns a floating point result. | 609 @code{string-to-number} returns a floating point result. |
570 | 610 |
571 The parsing skips spaces and tabs at the beginning of @var{string}, then | 611 The parsing skips spaces and tabs at the beginning of @var{string}, |
572 reads as much of @var{string} as it can interpret as a number. (On some | 612 then reads as much of @var{string} as it can interpret as a number in |
573 systems it ignores other whitespace at the beginning, not just spaces | 613 the given base. (On some systems it ignores other whitespace at the |
574 and tabs.) If the first character after the ignored whitespace is | 614 beginning, not just spaces and tabs.) If the first character after |
575 neither a digit, nor a plus or minus sign, nor the leading dot of a | 615 the ignored whitespace is neither a digit in the given base, nor a |
576 floating point number, this function returns 0. | 616 plus or minus sign, nor the leading dot of a floating point number, |
617 this function returns 0. | |
577 | 618 |
578 @example | 619 @example |
579 (string-to-number "256") | 620 (string-to-number "256") |
580 @result{} 256 | 621 @result{} 256 |
581 (string-to-number "25 is a perfect square.") | 622 (string-to-number "25 is a perfect square.") |
673 | 714 |
674 Starting in Emacs 21, if the object is a string, its text properties are | 715 Starting in Emacs 21, if the object is a string, its text properties are |
675 copied into the output. The text properties of the @samp{%s} itself | 716 copied into the output. The text properties of the @samp{%s} itself |
676 are also copied, but those of the object take priority. | 717 are also copied, but those of the object take priority. |
677 | 718 |
678 If there is no corresponding object, the empty string is used. | |
679 | |
680 @item %S | 719 @item %S |
681 Replace the specification with the printed representation of the object, | 720 Replace the specification with the printed representation of the object, |
682 made with quoting (that is, using @code{prin1}---@pxref{Output | 721 made with quoting (that is, using @code{prin1}---@pxref{Output |
683 Functions}). Thus, strings are enclosed in @samp{"} characters, and | 722 Functions}). Thus, strings are enclosed in @samp{"} characters, and |
684 @samp{\} characters appear where necessary before special characters. | 723 @samp{\} characters appear where necessary before special characters. |
685 | 724 |
686 If there is no corresponding object, the empty string is used. | |
687 | |
688 @item %o | 725 @item %o |
689 @cindex integer to octal | 726 @cindex integer to octal |
690 Replace the specification with the base-eight representation of an | 727 Replace the specification with the base-eight representation of an |
691 integer. | 728 integer. |
692 | 729 |
745 @cindex numeric prefix | 782 @cindex numeric prefix |
746 @cindex field width | 783 @cindex field width |
747 @cindex padding | 784 @cindex padding |
748 All the specification characters allow an optional numeric prefix | 785 All the specification characters allow an optional numeric prefix |
749 between the @samp{%} and the character. The optional numeric prefix | 786 between the @samp{%} and the character. The optional numeric prefix |
750 defines the minimum width for the object. If the printed representation | 787 defines the minimum width for the object. If the printed |
751 of the object contains fewer characters than this, then it is padded. | 788 representation of the object contains fewer characters than this, then |
752 The padding is on the left if the prefix is positive (or starts with | 789 it is padded. The padding is on the left if the prefix is positive |
753 zero) and on the right if the prefix is negative. The padding character | 790 (or starts with zero) and on the right if the prefix is negative. The |
754 is normally a space, but if the numeric prefix starts with a zero, zeros | 791 padding character is normally a space, but if the numeric prefix |
755 are used for padding. Here are some examples of padding: | 792 starts with a zero, zeros are used for padding. Some of these |
793 conventions are ignored for specification characters for which they do | |
794 not make sense. That is, %s, %S and %c accept a numeric prefix | |
795 starting with 0, but still pad with @emph{spaces} on the left. Also, | |
796 %% accepts a numeric prefix, but ignores it. Here are some examples | |
797 of padding: | |
756 | 798 |
757 @example | 799 @example |
758 (format "%06d is padded on the left with zeros" 123) | 800 (format "%06d is padded on the left with zeros" 123) |
759 @result{} "000123 is padded on the left with zeros" | 801 @result{} "000123 is padded on the left with zeros" |
760 | 802 |
870 | 912 |
871 When the argument to @code{capitalize} is a character, @code{capitalize} | 913 When the argument to @code{capitalize} is a character, @code{capitalize} |
872 has the same result as @code{upcase}. | 914 has the same result as @code{upcase}. |
873 | 915 |
874 @example | 916 @example |
917 @group | |
875 (capitalize "The cat in the hat") | 918 (capitalize "The cat in the hat") |
876 @result{} "The Cat In The Hat" | 919 @result{} "The Cat In The Hat" |
877 | 920 @end group |
921 | |
922 @group | |
878 (capitalize "THE 77TH-HATTED CAT") | 923 (capitalize "THE 77TH-HATTED CAT") |
879 @result{} "The 77th-Hatted Cat" | 924 @result{} "The 77th-Hatted Cat" |
925 @end group | |
880 | 926 |
881 @group | 927 @group |
882 (capitalize ?x) | 928 (capitalize ?x) |
883 @result{} 88 | 929 @result{} 88 |
884 @end group | 930 @end group |
885 @end example | 931 @end example |
886 @end defun | 932 @end defun |
887 | 933 |
888 @defun upcase-initials string | 934 @defun upcase-initials string-or-char |
889 This function capitalizes the initials of the words in @var{string}, | 935 If @var{string-or-char} is a string, this function capitalizes the |
890 without altering any letters other than the initials. It returns a new | 936 initials of the words in @var{string-or-char}, without altering any |
891 string whose contents are a copy of @var{string}, in which each word has | 937 letters other than the initials. It returns a new string whose |
938 contents are a copy of @var{string-or-char}, in which each word has | |
892 had its initial letter converted to upper case. | 939 had its initial letter converted to upper case. |
893 | 940 |
894 The definition of a word is any sequence of consecutive characters that | 941 The definition of a word is any sequence of consecutive characters that |
895 are assigned to the word constituent syntax class in the current syntax | 942 are assigned to the word constituent syntax class in the current syntax |
896 table (@pxref{Syntax Class Table}). | 943 table (@pxref{Syntax Class Table}). |
944 | |
945 When the argument to @code{upcase-initials} is a character, | |
946 @code{upcase-initials} has the same result as @code{upcase}. | |
897 | 947 |
898 @example | 948 @example |
899 @group | 949 @group |
900 (upcase-initials "The CAT in the hAt") | 950 (upcase-initials "The CAT in the hAt") |
901 @result{} "The CAT In The HAt" | 951 @result{} "The CAT In The HAt" |