diff lispref/strings.texi @ 21682:90da2489c498

*** empty log message ***
author Richard M. Stallman <rms@gnu.org>
date Mon, 20 Apr 1998 17:43:57 +0000
parents 66d807bdc5b4
children d4ac295a98b3
--- a/lispref/strings.texi	Mon Apr 20 17:37:53 1998 +0000
+++ b/lispref/strings.texi	Mon Apr 20 17:43:57 1998 +0000
@@ -29,8 +29,8 @@
 * Text Comparison::           Comparing characters or strings.
 * String Conversion::         Converting characters or strings and vice versa.
 * Formatting Strings::        @code{format}: Emacs's analog of @code{printf}.
-* Character Case::            Case conversion functions.
-* Case Table::		      Customizing case conversion.
+* Case Conversion::           Case conversion functions.
+* Case Tables::		      Customizing case conversion.
 @end menu
 
 @node String Basics
@@ -38,19 +38,19 @@
 
   Strings in Emacs Lisp are arrays that contain an ordered sequence of
 characters.  Characters are represented in Emacs Lisp as integers;
-whether an integer was intended as a character or not is determined only
-by how it is used.  Thus, strings really contain integers.
+whether an integer is a character or not is determined only by how it is
+used.  Thus, strings really contain integers.
 
   The length of a string (like any array) is fixed, and cannot be
 altered once the string exists.  Strings in Lisp are @emph{not}
 terminated by a distinguished character code.  (By contrast, strings in
 C are terminated by a character with @sc{ASCII} code 0.)
 
-  Since strings are considered arrays, you can operate on them with the
-general array functions.  (@xref{Sequences Arrays Vectors}.)  For
-example, you can access or change individual characters in a string
-using the functions @code{aref} and @code{aset} (@pxref{Array
-Functions}).
+  Since strings are arrays, and therefore sequences as well, you can
+operate on them with the general array and sequence functions.
+(@xref{Sequences Arrays Vectors}.)  For example, you can access or
+change individual characters in a string using the functions @code{aref}
+and @code{aset} (@pxref{Array Functions}).
 
   There are two text representations for non-@sc{ASCII} characters in
 Emacs strings (and in buffers): unibyte and multibyte (@pxref{Text
@@ -62,8 +62,8 @@
 
   Sometimes key sequences are represented as strings.  When a string is
 a key sequence, string elements in the range 128 to 255 represent meta
-characters (which are extremely large integers) rather than keyboard
-events in the range 128 to 255.
+characters (which are extremely large integers) rather than character
+codes in the range 128 to 255.
 
   Strings cannot hold characters that have the hyper, super or alt
 modifiers; they can hold @sc{ASCII} control characters, but no other
@@ -201,14 +201,19 @@
 If the characters copied from @var{string} have text properties, the
 properties are copied into the new string also.  @xref{Text Properties}.
 
+@code{substring} also allows vectors for the first argument.
+For example:
+
+@example
+(substring [a b (c) "d"] 1 3)
+     @result{} [b (c)]
+@end example
+
 A @code{wrong-type-argument} error is signaled if either @var{start} or
 @var{end} is not an integer or @code{nil}.  An @code{args-out-of-range}
 error is signaled if @var{start} indicates a character following
 @var{end}, or if either integer is out of range for @var{string}.
 
-@code{substring} actually allows vectors as well as strings for
-the first argument.
-
 Contrast this function with @code{buffer-substring} (@pxref{Buffer
 Contents}), which returns a string containing a portion of the text in
 the current buffer.  The beginning of a string is at index 0, but the
@@ -313,7 +318,7 @@
 @var{idx} @var{char})} stores @var{char} into @var{string} at index
 @var{idx}.  Each character occupies one or more bytes, and if @var{char}
 needs a different number of bytes from the character already present at
-that index, @code{aset} gets an error.
+that index, @code{aset} signals an error.
 
   A more powerful function is @code{store-substring}:
 
@@ -325,8 +330,8 @@
 
 Since it is impossible to change the length of an existing string, it is
 an error if @var{obj} doesn't fit within @var{string}'s actual length,
-or if it requires a different number of bytes from the characters
-currently present at that point in @var{string}.
+or if any new character requires a different number of bytes from the
+character currently present at that point in @var{string}.
 @end defun
 
 @need 2000
@@ -365,7 +370,7 @@
 strings.  When @code{equal} (@pxref{Equality Predicates}) compares two
 strings, it uses @code{string=}.
 
-If the arguments contain non-@sc{ASCII} characters, and one is unibyte
+If the strings contain non-@sc{ASCII} characters, and one is unibyte
 while the other is multibyte, then they cannot be equal.  @xref{Text
 Representations}.
 @end defun
@@ -385,11 +390,12 @@
 @var{string2}, then @var{string1} is greater, and this function returns
 @code{nil}.  If the two strings match entirely, the value is @code{nil}.
 
-Pairs of characters are compared by their @sc{ASCII} codes.  Keep in
-mind that lower case letters have higher numeric values in the
-@sc{ASCII} character set than their upper case counterparts; numbers and
+Pairs of characters are compared according to their character codes.
+Keep in mind that lower case letters have higher numeric values in the
+@sc{ASCII} character set than their upper case counterparts; digits and
 many punctuation characters have a lower numeric value than upper case
-letters.  A unibyte non-@sc{ASCII} character is always less than any
+letters.  An @sc{ASCII} character is less than any non-@sc{ASCII}
+character; a unibyte non-@sc{ASCII} character is always less than any
 multibyte non-@sc{ASCII} character (@pxref{Text Representations}).
 
 @example
@@ -453,23 +459,9 @@
 
 @defun char-to-string character
 @cindex character to string
-  This function returns a new string with a length of one character.
-The value of @var{character}, modulo 256, is used to initialize the
-element of the string.
-
-This function is similar to @code{make-string} with an integer argument
-of 1.  (@xref{Creating Strings}.)  This conversion can also be done with
-@code{format} using the @samp{%c} format specification.
-(@xref{Formatting Strings}.)
-
-@example
-(char-to-string ?x)
-     @result{} "x"
-(char-to-string (+ 256 ?x))
-     @result{} "x"
-(make-string 1 ?x)
-     @result{} "x"
-@end example
+This function returns a new string containing one character,
+@var{character}.  This function is semi-obsolete because the function
+@code{string} is more general.  @xref{Creating Strings}.
 @end defun
 
 @defun string-to-char string
@@ -579,7 +571,7 @@
 in how they use the result of formatting.
 
 @defun format string &rest objects
-  This function returns a new string that is made by copying
+This function returns a new string that is made by copying
 @var{string} and then replacing any format specification 
 in the copy with encodings of the corresponding @var{objects}.  The
 arguments @var{objects} are the computed values to be formatted.
@@ -619,7 +611,7 @@
 @item %s
 Replace the specification with the printed representation of the object,
 made without quoting (that is, using @code{princ}, not
-@code{print}---@pxref{Output Functions}).  Thus, strings are represented
+@code{prin1}---@pxref{Output Functions}).  Thus, strings are represented
 by their contents alone, with no @samp{"} characters, and symbols appear
 without @samp{\} characters.
 
@@ -740,12 +732,13 @@
 @end group
 @end smallexample
 
-@node Character Case
+@node Case Conversion
 @comment node-name, next, previous, up 
-@section Character Case
+@section Case Conversion in Lisp
 @cindex upper case 
 @cindex lower case 
 @cindex character case 
+@cindex case conversion in Lisp
 
   The character case functions change the case of single characters or
 of the contents of strings.  The functions convert only alphabetic
@@ -827,18 +820,39 @@
 @end example
 @end defun
 
-@node Case Table
+@defun upcase-initials string
+This function capitalizes the initials of the words in @var{string},
+without altering any letters other than the initials.  It returns a new
+string whose contents are a copy of @var{string}, in which the initial
+letter of each word has been converted to upper case.
+
+The definition of a word is any sequence of consecutive characters that
+are assigned to the word constituent syntax class in the current syntax
+table (@pxref{Syntax Class Table}).
+
+@example
+@group
+(upcase-initials "The CAT in the hAt")
+     @result{} "The CAT In The HAt"
+@end group
+@end example
+@end defun
+
+@node Case Tables
 @section The Case Table
 
   You can customize case conversion by installing a special @dfn{case
 table}.  A case table specifies the mapping between upper case and lower
-case letters.  It affects both the string and character case conversion
-functions (see the previous section) and those that apply to text in the
-buffer (@pxref{Case Changes}).
+case letters.  It affects both the case conversion functions for Lisp
+objects (see the previous section) and those that apply to text in the
+buffer (@pxref{Case Changes}).  Each buffer has a case table; there is
+also a standard case table which is used to initialize the case table
+of new buffers.
 
-  A case table is a char-table whose subtype is @code{case-table}.  This
-char-table maps each character into the corresponding lower case
-character  It has three extra slots, which are related tables:
+  A case table is a char-table (@pxref{Char-Tables}) whose subtype is
+@code{case-table}.  This char-table maps each character into the
+corresponding lower case character.  It has three extra slots, which
+hold related tables:
 
 @table @var
 @item upcase
@@ -874,17 +888,13 @@
 equivalent characters.)
 
   When you construct a case table, you can provide @code{nil} for
-@var{canonicalize}; then Emacs fills in this string from the lower case
+@var{canonicalize}; then Emacs fills in this slot from the lower case
 and upper case mappings.  You can also provide @code{nil} for
-@var{equivalences}; then Emacs fills in this string from
+@var{equivalences}; then Emacs fills in this slot from
 @var{canonicalize}.  In a case table that is actually in use, those
 components are non-@code{nil}.  Do not try to specify @var{equivalences}
 without also specifying @var{canonicalize}.
 
-  Each buffer has a case table.  Emacs also has a @dfn{standard case
-table} which is copied into each buffer when you create the buffer.
-Changing the standard case table doesn't affect any existing buffers.
-
   Here are the functions for working with case tables:
 
 @defun case-table-p object
@@ -894,7 +904,7 @@
 
 @defun set-standard-case-table table
 This function makes @var{table} the standard case table, so that it will
-apply to any buffers created subsequently.
+be used in any buffers created subsequently.
 @end defun
 
 @defun standard-case-table
@@ -912,7 +922,8 @@
   The following three functions are convenient subroutines for packages
 that define non-@sc{ASCII} character sets.  They modify the specified
 case table @var{case-table}; they also modify the standard syntax table.
-@xref{Syntax Tables}.
+@xref{Syntax Tables}.  Normally you would use these functions to change
+the standard case table.
 
 @defun set-case-syntax-pair uc lc case-table
 This function specifies a pair of corresponding letters, one upper case