comparison man/mule.texi @ 43439:dffc7bf6189c

New node Charsets.
author Richard M. Stallman <rms@gnu.org>
date Wed, 20 Feb 2002 22:36:29 +0000
parents 44bde4d34db7
children 2c255d245320
43438:26a4fefad579 43439:dffc7bf6189c
96 * Defining Fontsets:: Defining a new fontset. 96 * Defining Fontsets:: Defining a new fontset.
97 * Undisplayable Characters:: When characters don't display. 97 * Undisplayable Characters:: When characters don't display.
98 * Single-Byte Character Support:: 98 * Single-Byte Character Support::
99 You can pick one European character set 99 You can pick one European character set
100 to use without multibyte characters. 100 to use without multibyte characters.
101 * Charsets:: How Emacs groups its internal character codes.
101 @end menu 102 @end menu
102 103
103 @node International Chars 104 @node International Chars
104 @section Introduction to International Character Sets 105 @section Introduction to International Character Sets
105 106
129 language, to make it convenient to type them. 130 language, to make it convenient to type them.
130 131
131 @kindex C-x RET 132 @kindex C-x RET
132 The prefix key @kbd{C-x @key{RET}} is used for commands that pertain 133 The prefix key @kbd{C-x @key{RET}} is used for commands that pertain
133 to multibyte characters, coding systems, and input methods. 134 to multibyte characters, coding systems, and input methods.
134
135 @ignore
136 @c This is commented out because it doesn't fit here, or anywhere.
137 @c This manual does not discuss "character sets" as they
138 @c are used in Mule, and it makes no sense to mention these commands
139 @c except as part of a larger discussion of the topic.
140 @c But it is not clear that topic is worth mentioning here,
141 @c since that is more of an implementation concept
142 @c than a user-level concept. And when we switch to Unicode,
143 @c character sets in the current sense may not even exist.
144
145 @findex list-charset-chars
146 @cindex characters in a certain charset
147 The command @kbd{M-x list-charset-chars} prompts for a name of a
148 character set, and displays all the characters in that character set.
149
150 @findex describe-character-set
151 @cindex character set, description
152 The command @kbd{M-x describe-character-set} prompts for a character
153 set name and displays information about that character set, including
154 its internal representation within Emacs.
155 @end ignore
156 135
157 @node Enabling Multibyte 136 @node Enabling Multibyte
158 @section Enabling Multibyte Characters 137 @section Enabling Multibyte Characters
159 138
160 @cindex turn multibyte support on or off 139 @cindex turn multibyte support on or off
1358 a minor mode that works much like the @code{latin-1-prefix} input 1337 a minor mode that works much like the @code{latin-1-prefix} input
1359 method, but does not depend on having the input methods installed. This 1338 method, but does not depend on having the input methods installed. This
1360 mode is buffer-local. It can be customized for various languages with 1339 mode is buffer-local. It can be customized for various languages with
1361 @kbd{M-x iso-accents-customize}. 1340 @kbd{M-x iso-accents-customize}.
1362 @end itemize 1341 @end itemize
1342
1343 @node Charsets
1344 @section Charsets
1345 @cindex charsets
1346
1347 Emacs groups all supported characters into disjoint @dfn{charsets}.
1348 Each character code belongs to one and only one charset. For
1349 historical reasons, Emacs typically divides an 8-bit character code
1350 for an extended version of ASCII into two charsets: ASCII, which
1351 covers the codes 0 through 127, plus another charset which covers the
1352 ``right-hand part'' (the codes 128 and up). For instance, the
1353 characters of Latin-1 include the Emacs charset @code{ascii} plus the
1354 Emacs charset @code{latin-iso8859-1}.
1355
1356 Emacs characters belonging to different charsets may look the same,
1357 but they are still different characters. For example, the letter
1358 @samp{o} with acute accent in charset @code{latin-iso8859-1}, used for
1359 Latin-1, is different from the letter @samp{o} with acute accent in
1360 charset @code{latin-iso8859-2}, used for Latin-2.
1361
1362 @findex list-charset-chars
1363 @cindex characters in a certain charset
1364 @findex describe-character-set
1365 There are two commands for obtaining information about Emacs
1366 charsets. The command @kbd{M-x list-charset-chars} prompts for a name
1367 of a character set, and displays all the characters in that character
1368 set. The command @kbd{M-x describe-character-set} prompts for a
1369 charset name and displays information about that charset, including
1370 its internal representation within Emacs.
1371
1372 To find out which charset a character in the buffer belongs to,
1373 put point before it and type @kbd{C-u C-x =}.