comparison man/mule.texi @ 43439:dffc7bf6189c
New node Charsets.
author | Richard M. Stallman <rms@gnu.org> |
---|---|
date | Wed, 20 Feb 2002 22:36:29 +0000 |
parents | 44bde4d34db7 |
children | 2c255d245320 |
43438:26a4fefad579 | 43439:dffc7bf6189c |
---|---|
96 * Defining Fontsets:: Defining a new fontset. | 96 * Defining Fontsets:: Defining a new fontset. |
97 * Undisplayable Characters:: When characters don't display. | 97 * Undisplayable Characters:: When characters don't display. |
98 * Single-Byte Character Support:: | 98 * Single-Byte Character Support:: |
99 You can pick one European character set | 99 You can pick one European character set |
100 to use without multibyte characters. | 100 to use without multibyte characters. |
101 * Charsets:: How Emacs groups its internal character codes. | |
101 @end menu | 102 @end menu |
102 | 103 |
103 @node International Chars | 104 @node International Chars |
104 @section Introduction to International Character Sets | 105 @section Introduction to International Character Sets |
105 | 106 |
129 language, to make it convenient to type them. | 130 language, to make it convenient to type them. |
130 | 131 |
131 @kindex C-x RET | 132 @kindex C-x RET |
132 The prefix key @kbd{C-x @key{RET}} is used for commands that pertain | 133 The prefix key @kbd{C-x @key{RET}} is used for commands that pertain |
133 to multibyte characters, coding systems, and input methods. | 134 to multibyte characters, coding systems, and input methods. |
134 | |
135 @ignore | |
136 @c This is commented out because it doesn't fit here, or anywhere. | |
137 @c This manual does not discuss "character sets" as they | |
138 @c are used in Mule, and it makes no sense to mention these commands | |
139 @c except as part of a larger discussion of the topic. | |
140 @c But it is not clear that topic is worth mentioning here, | |
141 @c since that is more of an implementation concept | |
142 @c than a user-level concept. And when we switch to Unicode, | |
143 @c character sets in the current sense may not even exist. | |
144 | |
145 @findex list-charset-chars | |
146 @cindex characters in a certain charset | |
147 The command @kbd{M-x list-charset-chars} prompts for a name of a | |
148 character set, and displays all the characters in that character set. | |
149 | |
150 @findex describe-character-set | |
151 @cindex character set, description | |
152 The command @kbd{M-x describe-character-set} prompts for a character | |
153 set name and displays information about that character set, including | |
154 its internal representation within Emacs. | |
155 @end ignore | |
156 | 135 |
157 @node Enabling Multibyte | 136 @node Enabling Multibyte |
158 @section Enabling Multibyte Characters | 137 @section Enabling Multibyte Characters |
159 | 138 |
160 @cindex turn multibyte support on or off | 139 @cindex turn multibyte support on or off |
1358 a minor mode that works much like the @code{latin-1-prefix} input | 1337 a minor mode that works much like the @code{latin-1-prefix} input |
1359 method, but does not depend on having the input methods installed. This | 1338 method, but does not depend on having the input methods installed. This |
1360 mode is buffer-local. It can be customized for various languages with | 1339 mode is buffer-local. It can be customized for various languages with |
1361 @kbd{M-x iso-accents-customize}. | 1340 @kbd{M-x iso-accents-customize}. |
1362 @end itemize | 1341 @end itemize |
1342 | |
1343 @node Charsets | |
1344 @section Charsets | |
1345 @cindex charsets | |
1346 | |
1347 Emacs groups all supported characters into disjoint @dfn{charsets}. | |
1348 Each character code belongs to one and only one charset. For | |
1349 historical reasons, Emacs typically divides an 8-bit character code | |
1350 for an extended version of ASCII into two charsets: ASCII, which | |
1351 covers the codes 0 through 127, plus another charset which covers the | |
1352 ``right-hand part'' (the codes 128 and up). For instance, the | |
1353 characters of Latin-1 include the Emacs charset @code{ascii} plus the | |
1354 Emacs charset @code{latin-iso8859-1}. | |
1355 | |
1356 Emacs characters belonging to different charsets may look the same, | |
1357 but they are still different characters. For example, the letter | |
1358 @samp{o} with acute accent in charset @code{latin-iso8859-1}, used for | |
1359 Latin-1, is different from the letter @samp{o} with acute accent in | |
1360 charset @code{latin-iso8859-2}, used for Latin-2. | |
1361 | |
1362 @findex list-charset-chars | |
1363 @cindex characters in a certain charset | |
1364 @findex describe-character-set | |
1365 There are two commands for obtaining information about Emacs | |
1366 charsets. The command @kbd{M-x list-charset-chars} prompts for a name | |
1367 of a character set, and displays all the characters in that character | |
1368 set. The command @kbd{M-x describe-character-set} prompts for a | |
1369 charset name and displays information about that charset, including | |
1370 its internal representation within Emacs. | |
1371 | |
1372 To find out which charset a character in the buffer belongs to, | |
1373 put point before it and type @kbd{C-u C-x =}. |
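
The split described in the new Charsets node can be illustrated with a small sketch. This is not Emacs code: the function `classify` and its `right_hand` parameter are hypothetical names introduced here, and only the charset names `ascii`, `latin-iso8859-1`, and `latin-iso8859-2` come from the text above. It assumes an 8-bit code in an extended-ASCII encoding and models a character as a (charset, code) pair, so that two characters that look alike still compare as different when their charsets differ.

```python
from typing import Tuple


def classify(code: int, right_hand: str = "latin-iso8859-1") -> Tuple[str, int]:
    """Assign an 8-bit code to exactly one charset (illustrative model only).

    Codes 0 through 127 go to "ascii"; codes 128 and up go to the
    charset named by `right_hand` (a hypothetical parameter).
    """
    if not 0 <= code <= 255:
        raise ValueError("expected an 8-bit code in the range 0..255")
    if code < 128:
        return ("ascii", code)
    return (right_hand, code)


if __name__ == "__main__":
    # 'A' falls in the ASCII half regardless of the right-hand charset.
    print(classify(0x41))

    # 0xF3 is o with acute accent in both ISO 8859-1 and ISO 8859-2.
    o_latin1 = classify(0xF3, "latin-iso8859-1")
    o_latin2 = classify(0xF3, "latin-iso8859-2")

    # Same glyph, different charsets, hence different characters in this model.
    print(o_latin1 != o_latin2)
```

In this model the two accented letters stay distinct because the charset is part of the character's identity, which is the point the node makes about `latin-iso8859-1` and `latin-iso8859-2`.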