emacs: lispref/objects.texi comparison

comparison lispref/objects.texi @ 72859:c5744ceda9ba

(Character Type): Node split. Add xref to Describing Characters. (Basic Char Syntax, General Escape Syntax) (Ctl-Char Syntax, Meta-Char Syntax): New subnodes.

author	Richard M. Stallman <rms@gnu.org>
date	Thu, 14 Sep 2006 01:43:18 +0000
parents	a02949a3a808
children	6d19c76d81c5 a1a25ac6c88a

-:a9629d84bf9f
+:c5744ceda9ba
 A @dfn{character} in Emacs Lisp is nothing more than an integer.  In
 other words, characters are represented by their character codes.  For
 example, the character @kbd{A} is represented as the @w{integer 65}.
-Individual characters are not often used in programs.  It is far more
+Individual characters are used occasionally in programs, but it is
-common to work with @emph{strings}, which are sequences composed of
+more common to work with @emph{strings}, which are sequences composed
-characters.  @xref{String Type}.
+of characters.  @xref{String Type}.
 Characters in strings, buffers, and files are currently limited to
 the range of 0 to 524287---nineteen bits.  But not all values in that
 range are valid character codes.  Codes 0 through 127 are
 @acronym{ASCII} codes; the rest are non-@acronym{ASCII}
 (@pxref{Non-ASCII Characters}).  Characters that represent keyboard
 input have a much wider range, to encode modifier keys such as
 Control, Meta and Shift.
+There are special functions for producing a human-readable textual
+description of a character for the sake of messages.  @xref{Describing
+Characters}.
+@menu
+* Basic Char Syntax::
+* General Escape Syntax::
+* Ctl-Char Syntax::
+* Meta-Char Syntax::
+* Other Char Bits::
+@end menu
+@node Basic Char Syntax
+@subsubsection Basic Char Syntax
 @cindex read syntax for characters
 @cindex printed representation for characters
 @cindex syntax for characters
 @cindex @samp{?} in character constant
 @cindex question mark in character constant
-Since characters are really integers, the printed representation of a
-character is a decimal number.  This is also a possible read syntax for
+Since characters are really integers, the printed representation of
-a character, but writing characters that way in Lisp programs is a very
+a character is a decimal number.  This is also a possible read syntax
-bad idea.  You should @emph{always} use the special read syntax formats
+for a character, but writing characters that way in Lisp programs is
-that Emacs Lisp provides for characters.  These syntax formats start
+not clear programming.  You should @emph{always} use the special read
-with a question mark.
+syntax formats that Emacs Lisp provides for characters.  These syntax
+formats start with a question mark.
 The usual read syntax for alphanumeric characters is a question mark
 followed by the character; thus, @samp{?A} for the character
 @kbd{A}, @samp{?B} for the character @kbd{B}, and @samp{?a} for the
 character @kbd{a}.
 @dfn{escape sequences}, because backslash plays the role of an
 ``escape character''; this terminology has nothing to do with the
 character @key{ESC}.  @samp{\s} is meant for use in character
 constants; in string constants, just write the space.
+A backslash is allowed, and harmless, preceding any character without
+a special escape meaning; thus, @samp{?\+} is equivalent to @samp{?+}.
+There is no reason to add a backslash before most characters.  However,
+you should add a backslash before any of the characters
+@samp{()\|;'`"#.,} to avoid confusing the Emacs commands for editing
+Lisp code.  You can also add a backslash before whitespace characters such as
+space, tab, newline and formfeed.  However, it is cleaner to use one of
+the easily readable escape sequences, such as @samp{\t} or @samp{\s},
+instead of an actual whitespace character such as a tab or a space.
+(If you do write backslash followed by a space, you should write
+an extra space after the character constant to separate it from the
+following text.)
+@node General Escape Syntax
+@subsubsection General Escape Syntax
+In addition to the specific escape sequences for special important
+control characters, Emacs provides general categories of escape syntax
+that you can use to specify non-ASCII text characters.
+@cindex unicode character escape
+For instance, you can specify characters by their Unicode values.
+@code{?\u@var{nnnn}} represents a character that maps to the Unicode
+code point @samp{U+@var{nnnn}}.  There is a slightly different syntax
+for specifying characters with code points above @code{#xFFFF};
+@code{\U00@var{nnnnnn}} represents the character whose Unicode code
+point is @samp{U+@var{nnnnnn}}, if such a character is supported by
+Emacs.  If the corresponding character is not supported, Emacs signals
+an error.
+This peculiar and inconvenient syntax was adopted for compatibility
+with other programming languages.  Unlike some other languages, Emacs
+Lisp supports this syntax in only character literals and strings.
+@cindex @samp{\} in character constant
+@cindex backslash in character constant
+@cindex octal character code
+The most general read syntax for a character represents the
+character code in either octal or hex.  To use octal, write a question
+mark followed by a backslash and the octal character code (up to three
+octal digits); thus, @samp{?\101} for the character @kbd{A},
+@samp{?\001} for the character @kbd{C-a}, and @code{?\002} for the
+character @kbd{C-b}.  Although this syntax can represent any
+@acronym{ASCII} character, it is preferred only when the precise octal
+value is more important than the @acronym{ASCII} representation.
+@example
+@group
+?\012 @result{} 10         ?\n @result{} 10         ?\C-j @result{} 10
+?\101 @result{} 65         ?A @result{} 65
+@end group
+@end example
+To use hex, write a question mark followed by a backslash, @samp{x},
+and the hexadecimal character code.  You can use any number of hex
+digits, so you can represent any character code in this way.
+Thus, @samp{?\x41} for the character @kbd{A}, @samp{?\x1} for the
+character @kbd{C-a}, and @code{?\x8e0} for the Latin-1 character
+@iftex
+@samp{@`a}.
+@end iftex
+@ifnottex
+@samp{a} with grave accent.
+@end ifnottex
+@node Ctl-Char Syntax
+@subsubsection Control-Character Syntax
 @cindex control characters
-Control characters may be represented using yet another read syntax.
+Control characters can be represented using yet another read syntax.
 This consists of a question mark followed by a backslash, caret, and the
 corresponding non-control character, in either upper or lower case.  For
 example, both @samp{?\^I} and @samp{?\^i} are valid read syntax for the
 character @kbd{C-i}, the character whose value is 9.
 we recommend the @samp{^} syntax; for control characters in keyboard
 input, we prefer the @samp{C-} syntax.  Which one you use does not
 affect the meaning of the program, but may guide the understanding of
 people who read it.
+@node Meta-Char Syntax
+@subsubsection Meta-Character Syntax
 @cindex meta characters
 A @dfn{meta character} is a character typed with the @key{META}
 modifier key.  The integer that represents such a character has the
 @tex
 @math{2^{27}}
 @samp{?\M-A} stands for @kbd{M-A}.  You can use @samp{\M-} together with
 octal character codes (see below), with @samp{\C-}, or with any other
 syntax for a character.  Thus, you can write @kbd{M-A} as @samp{?\M-A},
 or as @samp{?\M-\101}.  Likewise, you can write @kbd{C-M-b} as
 @samp{?\M-\C-b}, @samp{?\C-\M-b}, or @samp{?\M-\002}.
+@node Other Char Bits
+@subsubsection Other Character Modifier Bits
 The case of a graphic character is indicated by its character code;
 for example, @acronym{ASCII} distinguishes between the characters @samp{a}
 and @samp{A}.  But @acronym{ASCII} has no way to represent whether a control
 character is upper case or lower case.  Emacs uses the
 @ifnottex
 Numerically, the
 bit values are 2**22 for alt, 2**23 for super and 2**24 for hyper.
 @end ifnottex
-@cindex unicode character escape
-Emacs provides a syntax for specifying characters by their Unicode
-code points.  @code{?\u@var{nnnn}} represents a character that maps to
-the Unicode code point @samp{U+@var{nnnn}}.  There is a slightly
-different syntax for specifying characters with code points above
-@code{#xFFFF}; @code{\U00@var{nnnnnn}} represents the character whose
-Unicode code point is @samp{U+@var{nnnnnn}}, if such a character
-is supported by Emacs.  If the corresponding character is not
-supported, Emacs signals an error.
-This peculiar and inconvenient syntax was adopted for compatibility
-with other programming languages.  Unlike some other languages, Emacs
-Lisp supports this syntax in only character literals and strings.
-@cindex @samp{\} in character constant
-@cindex backslash in character constant
-@cindex octal character code
-Finally, the most general read syntax for a character represents the
-character code in either octal or hex.  To use octal, write a question
-mark followed by a backslash and the octal character code (up to three
-octal digits); thus, @samp{?\101} for the character @kbd{A},
-@samp{?\001} for the character @kbd{C-a}, and @code{?\002} for the
-character @kbd{C-b}.  Although this syntax can represent any @acronym{ASCII}
-character, it is preferred only when the precise octal value is more
-important than the @acronym{ASCII} representation.
-@example
-@group
-?\012 @result{} 10         ?\n @result{} 10         ?\C-j @result{} 10
-?\101 @result{} 65         ?A @result{} 65
-@end group
-@end example
-To use hex, write a question mark followed by a backslash, @samp{x},
-and the hexadecimal character code.  You can use any number of hex
-digits, so you can represent any character code in this way.
-Thus, @samp{?\x41} for the character @kbd{A}, @samp{?\x1} for the
-character @kbd{C-a}, and @code{?\x8e0} for the Latin-1 character
-@iftex
-@samp{@`a}.
-@end iftex
-@ifnottex
-@samp{a} with grave accent.
-@end ifnottex
-A backslash is allowed, and harmless, preceding any character without
-a special escape meaning; thus, @samp{?\+} is equivalent to @samp{?+}.
-There is no reason to add a backslash before most characters.  However,
-you should add a backslash before any of the characters
-@samp{()\|;'`"#.,} to avoid confusing the Emacs commands for editing
-Lisp code.  You can also add a backslash before whitespace characters such as
-space, tab, newline and formfeed.  However, it is cleaner to use one of
-the easily readable escape sequences, such as @samp{\t} or @samp{\s},
-instead of an actual whitespace character such as a tab or a space.
-(If you do write backslash followed by a space, you should write
-an extra space after the character constant to separate it from the
-following text.)
 @node Symbol Type
 @subsection Symbol Type
 A @dfn{symbol} in GNU Emacs Lisp is an object with a name.  The
 symbol name serves as the printed representation of the symbol.  In
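
The equivalences stated in the character-syntax nodes touched by this change can be spot-checked with a short Emacs Lisp sketch. This is not part of the changeset. The function name `objects-texi-demo-check` is hypothetical, and the `\H-`, `\s-` and `\A-` prefixes come from the Emacs reader rather than from the lines shown in this comparison. The modifier bit values (2**27 for meta, 2**24 for hyper, 2**23 for super, 2**22 for alt) are the ones the text gives.

```elisp
;; Hypothetical helper, not part of the manual or of Emacs itself.
;; Returns t when every documented equivalence holds.
(defun objects-texi-demo-check ()
  (and
   ;; Basic syntax: a character is just its integer code.
   (= ?A 65)
   ;; A backslash before a character with no escape meaning is harmless.
   (= ?\+ ?+)
   ;; Octal and hex escapes for the same characters.
   (= ?\101 ?A)
   (= ?\x41 ?A)
   (= ?\001 ?\x1)
   ;; Three spellings of the newline character, code 10.
   (= ?\012 10)
   (= ?\n 10)
   (= ?\C-j 10)
   ;; Caret syntax accepts either case; C-i is code 9.
   (= ?\^I 9)
   (= ?\^i 9)
   ;; Meta sets the 2**27 bit and combines with other syntaxes.
   (= ?\M-A (logior ?A (ash 1 27)))
   (= ?\M-\101 ?\M-A)
   (= ?\M-\C-b ?\C-\M-b)
   (= ?\M-\C-b ?\M-\002)
   ;; Other modifier bits: hyper 2**24, super 2**23, alt 2**22.
   (= ?\H-a (logior ?a (ash 1 24)))
   (= ?\s-a (logior ?a (ash 1 23)))
   (= ?\A-a (logior ?a (ash 1 22)))))
```

Evaluating `(objects-texi-demo-check)` in the scratch buffer should return `t`. Each comparison uses the two-argument form of `=` so that the sketch does not depend on newer versions of the function.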
