changeset 43439:dffc7bf6189c
New node Charsets.
author | Richard M. Stallman <rms@gnu.org> |
---|---|
date | Wed, 20 Feb 2002 22:36:29 +0000 |
parents | 26a4fefad579 |
children | 29dc2ac9c886 |
files | man/mule.texi |
diffstat | 1 files changed, 33 insertions(+), 22 deletions(-) |

--- a/man/mule.texi Wed Feb 20 22:35:56 2002 +0000
+++ b/man/mule.texi Wed Feb 20 22:36:29 2002 +0000
@@ -98,6 +98,7 @@
 * Single-Byte Character Support::
 You can pick one European character set
 to use without multibyte characters.
+* Charsets:: How Emacs groups its internal character codes.
 @end menu
 
 @node International Chars
@@ -132,28 +133,6 @@
 The prefix key @kbd{C-x @key{RET}} is used for commands that pertain
 to multibyte characters, coding systems, and input methods.
 
-@ignore
-@c This is commented out because it doesn't fit here, or anywhere.
-@c This manual does not discuss "character sets" as they
-@c are used in Mule, and it makes no sense to mention these commands
-@c except as part of a larger discussion of the topic.
-@c But it is not clear that topic is worth mentioning here,
-@c since that is more of an implementation concept
-@c than a user-level concept. And when we switch to Unicode,
-@c character sets in the current sense may not even exist.
-
-@findex list-charset-chars
-@cindex characters in a certain charset
- The command @kbd{M-x list-charset-chars} prompts for a name of a
-character set, and displays all the characters in that character set.
-
-@findex describe-character-set
-@cindex character set, description
- The command @kbd{M-x describe-character-set} prompts for a character
-set name and displays information about that character set, including
-its internal representation within Emacs.
-@end ignore
-
 @node Enabling Multibyte
 @section Enabling Multibyte Characters
 
@@ -1360,3 +1339,35 @@
 mode is buffer-local. It can be customized for various languages with
 @kbd{M-x iso-accents-customize}.
 @end itemize
+
+@node Charsets
+@section Charsets
+@cindex charsets
+
+ Emacs groups all supported characters into disjoint @dfn{charsets}.
+Each character code belongs to one and only one charset. For
+historical reasons, Emacs typically divides an 8-bit character code
+for an extended version of ASCII into two charsets: ASCII, which
+covers the codes 0 through 127, plus another charset which covers the
+``right-hand part'' (the codes 128 and up). For instance, the
+characters of Latin-1 include the Emacs charset @code{ascii} plus the
+Emacs charset @code{latin-iso8859-1}.
+
+ Emacs characters belonging to different charsets may look the same,
+but they are still different characters. For example, the letter
+@samp{o} with acute accent in charset @code{latin-iso8859-1}, used for
+Latin-1, is different from the letter @samp{o} with acute accent in
+charset @code{latin-iso8859-2}, used for Latin-2.
+
+@findex list-charset-chars
+@cindex characters in a certain charset
+@findex describe-character-set
+ There are two commands for obtaining information about Emacs
+charsets. The command @kbd{M-x list-charset-chars} prompts for a name
+of a character set, and displays all the characters in that character
+set. The command @kbd{M-x describe-character-set} prompts for a
+charset name and displays information about that charset, including
+its internal representation within Emacs.
+
+ To find out which charset a character in the buffer belongs to,
+put point before it and type @kbd{C-u C-x =}.
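
The split described in the new Charsets node can be illustrated with a small toy model. This is only a sketch of the idea (disjoint charsets, with an 8-bit code falling into `ascii` or a "right-hand" charset), not Emacs's internal representation; the names `EmacsChar` and `split_8bit` are hypothetical and do not appear in the manual or in Emacs.

```python
from typing import NamedTuple


class EmacsChar(NamedTuple):
    """Toy stand-in for a character: a charset name plus an 8-bit code.

    Hypothetical type for illustration; Emacs stores characters differently.
    """
    charset: str
    code: int


def split_8bit(code: int, right_hand_charset: str) -> EmacsChar:
    """Assign an 8-bit code to `ascii` (0-127) or to the right-hand charset (128-255)."""
    if not 0 <= code <= 255:
        raise ValueError("expected an 8-bit code")
    if code < 128:
        return EmacsChar("ascii", code)
    return EmacsChar(right_hand_charset, code)


# 0xF3 is the letter o with acute accent in both ISO 8859-1 and ISO 8859-2.
o_acute_latin1 = split_8bit(0xF3, "latin-iso8859-1")
o_acute_latin2 = split_8bit(0xF3, "latin-iso8859-2")

# Same byte, same glyph, but different charsets, so the two values differ.
print(o_acute_latin1 == o_acute_latin2)
```

In this model the letter `A` (code 65) lands in `ascii` whichever right-hand charset is named, while the two accented letters compare unequal because equality includes the charset, which mirrors the manual's point that look-alike characters from different charsets are distinct.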