comparison man/mule.texi @ 36170:0fd801cdb9fd

Clarify undisplayable characters, --unibyte, locales. Clarify self-insertion of non-ASCII 8-bit chars. Clarify coding system detection of escape sequences. Clarify keyboard input methods and coding systems. Comment out the commands to inquire about character sets. Misc cleanups.
author Richard M. Stallman <rms@gnu.org>
date Sat, 17 Feb 2001 18:12:07 +0000
parents 054acbd5e9f7
children 62cf166239f3
36169:86e871a073b6 36170:0fd801cdb9fd
40 Japanese, Korean, Lao, Thai, Tibetan, and Vietnamese scripts. These features 40 Japanese, Korean, Lao, Thai, Tibetan, and Vietnamese scripts. These features
41 have been merged from the modified version of Emacs known as MULE (for 41 have been merged from the modified version of Emacs known as MULE (for
42 ``MULti-lingual Enhancement to GNU Emacs'') 42 ``MULti-lingual Enhancement to GNU Emacs'')
43 43
44 Emacs also supports various encodings of these characters used by 44 Emacs also supports various encodings of these characters used by
45 internationalized software, such as word processors, mailers, etc. 45 other internationalized software, such as word processors and mailers.
46 46
47 @menu 47 @menu
48 * International Intro:: Basic concepts of multibyte characters. 48 * International Intro:: Basic concepts of multibyte characters.
49 * Enabling Multibyte:: Controlling whether to use multibyte characters. 49 * Enabling Multibyte:: Controlling whether to use multibyte characters.
50 * Language Environments:: Setting things up for the language you use. 50 * Language Environments:: Setting things up for the language you use.
78 cases) in the @kbd{C-q} command (@pxref{Multibyte Conversion}). 78 cases) in the @kbd{C-q} command (@pxref{Multibyte Conversion}).
79 79
80 @kindex C-h h 80 @kindex C-h h
81 @findex view-hello-file 81 @findex view-hello-file
82 @cindex undisplayable characters 82 @cindex undisplayable characters
83 @cindex ? 83 @cindex @samp{?} in display
84 @cindex ??
85 The command @kbd{C-h h} (@code{view-hello-file}) displays the file 84 The command @kbd{C-h h} (@code{view-hello-file}) displays the file
86 @file{etc/HELLO}, which shows how to say ``hello'' in many languages. 85 @file{etc/HELLO}, which shows how to say ``hello'' in many languages.
87 This illustrates various scripts. If the font you're using doesn't have 86 This illustrates various scripts. If some characters can't be
88 characters for all those different languages, you will see some hollow 87 displayed on your terminal, they appear as @samp{?} or as hollow boxes
89 boxes instead of characters; see @ref{Fontsets}. On non-windowing 88 (@pxref{Undisplayable Characters}).
90 displays, @samp{?} is displayed in place of the hollow box. More than 89
91 one @samp{?} is displayed for undisplayable characters that are wider 90 Keyboards, even in the countries where these character sets are used,
92 than one column. 91 generally don't have keys for all the characters in them. So Emacs
92 supports various @dfn{input methods}, typically one for each script or
93 language, to make it convenient to type them.
94
95 @kindex C-x RET
96 The prefix key @kbd{C-x @key{RET}} is used for commands that pertain
97 to multibyte characters, coding systems, and input methods.
98
99 @ignore
100 @c This is commented out because it doesn't fit here, or anywhere.
101 @c This manual does not discuss "character sets" as they
102 @c are used in Mule, and it makes no sense to mention these commands
103 @c except as part of a larger discussion of the topic.
104 @c But it is not clear that topic is worth mentioning here,
105 @c since that is more of an implementation concept
106 @c than a user-level concept. And when we switch to Unicode,
107 @c character sets in the current sense may not even exist.
93 108
94 @findex list-charset-chars 109 @findex list-charset-chars
95 @cindex characters in a certain charset 110 @cindex characters in a certain charset
96 The command @kbd{M-x list-charset-chars} prompts for a name of a 111 The command @kbd{M-x list-charset-chars} prompts for a name of a
97 character set, and displays all the characters in that character set. 112 character set, and displays all the characters in that character set.
99 @findex describe-character-set 114 @findex describe-character-set
100 @cindex character set, description 115 @cindex character set, description
101 The command @kbd{M-x describe-character-set} prompts for a character 116 The command @kbd{M-x describe-character-set} prompts for a character
102 set name and displays information about that character set, including 117 set name and displays information about that character set, including
103 its internal representation within Emacs. 118 its internal representation within Emacs.
104 119 @end ignore
105 Keyboards, even in the countries where these character sets are used,
106 generally don't have keys for all the characters in them. So Emacs
107 supports various @dfn{input methods}, typically one for each script or
108 language, to make it convenient to type them.
109
110 @kindex C-x RET
111 The prefix key @kbd{C-x @key{RET}} is used for commands that pertain
112 to multibyte characters, coding systems, and input methods.
113 120
114 @node Enabling Multibyte 121 @node Enabling Multibyte
115 @section Enabling Multibyte Characters 122 @section Enabling Multibyte Characters
116 123
117 You can enable or disable multibyte character support, either for 124 You can enable or disable multibyte character support, either for
151 @cindex Lisp files, and multibyte operation 158 @cindex Lisp files, and multibyte operation
152 @cindex multibyte operation, and Lisp files 159 @cindex multibyte operation, and Lisp files
153 @cindex unibyte operation, and Lisp files 160 @cindex unibyte operation, and Lisp files
154 @cindex init file, and non-ASCII characters 161 @cindex init file, and non-ASCII characters
155 @cindex environment variables, and non-ASCII characters 162 @cindex environment variables, and non-ASCII characters
156 Multibyte strings are not created during initialization from the 163 With @samp{--unibyte}, multibyte strings are not created during
157 values of environment variables, @file{/etc/passwd} entries etc.@: that 164 initialization from the values of environment variables,
158 contain non-ASCII 8-bit characters. However, Lisp files, when they are 165 @file{/etc/passwd} entries etc.@: that contain non-ASCII 8-bit
159 loaded for running, and in particular the initialization file 166 characters.
160 @file{.emacs}, are normally read as multibyte---even with 167
161 @samp{--unibyte}. To avoid multibyte strings being generated by 168 Emacs normally loads Lisp files as multibyte, regardless of whether
162 non-ASCII characters in Lisp files, put @samp{-*-unibyte: t;-*-} in a 169 you used @samp{--unibyte}. This includes the Emacs initialization
163 comment on the first line, or specify the coding system @samp{raw-text} 170 file, @file{.emacs}, and the initialization files of Emacs packages
164 with @kbd{C-x @key{RET} c}. Do the same for initialization files for 171 such as Gnus. However, you can specify unibyte loading for a
165 packages like Gnus. 172 particular Lisp file, by putting @samp{-*-unibyte: t;-*-} in a comment
173 on the first line. Then that file is always loaded as unibyte text,
174 even if you did not start Emacs with @samp{--unibyte}. The motivation
175 for these conventions is that it is more reliable to always load any
176 particular Lisp file in the same way. However, you can load a Lisp
177 file as unibyte, on any one occasion, by typing @kbd{C-x @key{RET} c
178 raw-text @key{RET}} immediately before loading it.
166 179
167 The mode line indicates whether multibyte character support is enabled 180 The mode line indicates whether multibyte character support is enabled
168 in the current buffer. If it is, there are two or more characters (most 181 in the current buffer. If it is, there are two or more characters (most
169 often two dashes) before the colon near the beginning of the mode line. 182 often two dashes) before the colon near the beginning of the mode line.
170 When multibyte characters are not enabled, just one dash precedes the 183 When multibyte characters are not enabled, just one dash precedes the
204 Latin-5, Latin-8 (Celtic), Latin-9 (updated Latin-1, with the Euro 217 Latin-5, Latin-8 (Celtic), Latin-9 (updated Latin-1, with the Euro
205 sign), Polish, Romanian, Slovak, Slovenian, Thai, Tibetan, Turkish, 218 sign), Polish, Romanian, Slovak, Slovenian, Thai, Tibetan, Turkish,
206 Dutch, Spanish, and Vietnamese. 219 Dutch, Spanish, and Vietnamese.
207 @end quotation 220 @end quotation
208 221
209 @cindex fonts, for displaying different languages 222 @cindex fonts for various scripts
210 To be able to display the script(s) used by your language environment 223 To display the script(s) used by your language environment on a
211 on a windowed display, you need to have a suitable font installed. If 224 graphical display, you need to have a suitable font. If some of the
212 some of the characters appear as empty boxes, download and install the 225 characters appear as empty boxes, you should install the GNU Intlfonts
213 GNU Intlfonts distribution, which includes fonts for all supported 226 package, which includes fonts for all supported scripts.
214 scripts. @xref{Fontsets}, for more details about setting up your 227 @xref{Fontsets}, for more details about setting up your fonts.
215 fonts.
216 228
217 @findex set-locale-environment 229 @findex set-locale-environment
218 @vindex locale-language-names 230 @vindex locale-language-names
219 @vindex locale-charset-language-names 231 @vindex locale-charset-language-names
220 @cindex locales 232 @cindex locales
221 Some operating systems let you specify the language you are using by 233 Some operating systems let you specify the language you are using by
222 setting the locale environment variables @env{LC_ALL}, @env{LC_CTYPE}, 234 setting the locale environment variables @env{LC_ALL}, @env{LC_CTYPE},
223 and @env{LANG}; the first of these which is nonempty specifies your 235 or @env{LANG}.@footnote{If more than one of these is set, the first
224 locale. Emacs handles this during startup by invoking the 236 one that is nonempty specifies your locale for this purpose.} Emacs
225 @code{set-locale-environment} function, which matches your locale 237 handles this during startup by matching your locale against entries in
226 against entries in the value of the variable 238 the value of the variables @code{locale-charset-language-names} and
227 @code{locale-language-names} and selects the corresponding language 239 @code{locale-language-names} and selects the corresponding language
228 environment if a match is found. But if your locale also matches an 240 environment if a match is found. (The former variable overrides the
229 entry in the variable @code{locale-charset-language-names}, this entry 241 latter.) It also adjusts the display table and terminal coding
230 is preferred if its character set disagrees. For example, suppose the 242 system, the locale coding system, and the preferred coding system as
231 locale @samp{en_GB.ISO8859-15} matches @code{"Latin-1"} in 243 needed for the locale.
232 @code{locale-language-names} and @code{"Latin-9"} in 244
233 @code{locale-charset-language-names}; since these two language 245 If you modify the @env{LC_ALL}, @env{LC_CTYPE}, or @env{LANG}
234 environments' character sets disagree, Emacs uses @code{"Latin-9"}. 246 environment variables while running Emacs, you may want to invoke the
235 247 @code{set-locale-environment} function afterwards to readjust the
236 If all goes well, the @code{set-locale-environment} function selects 248 language environment from the new locale.
237 the language environment, since language is part of locale. It also 249
238 adjusts the display table and terminal coding system, the locale coding
239 system, and the preferred coding system as needed for the locale.
240
241 Since the @code{set-locale-environment} function is automatically
242 invoked during startup, you normally do not need to invoke it yourself.
243 However, if you modify the @env{LC_ALL}, @env{LC_CTYPE}, or @env{LANG}
244 environment variables, you may want to invoke the
245 @code{set-locale-environment} function afterwards.
246
247 @findex set-locale-environment
248 @vindex locale-preferred-coding-systems 250 @vindex locale-preferred-coding-systems
249 The @code{set-locale-environment} function normally uses the preferred 251 The @code{set-locale-environment} function normally uses the preferred
250 coding system established by the language environment to decode system 252 coding system established by the language environment to decode system
251 messages. But if your locale matches an entry in the variable 253 messages. But if your locale matches an entry in the variable
252 @code{locale-preferred-coding-systems}, Emacs uses the corresponding 254 @code{locale-preferred-coding-systems}, Emacs uses the corresponding
253 coding system instead. For example, if the locale @samp{ja_JP.PCK} 255 coding system instead. For example, if the locale @samp{ja_JP.PCK}
254 matches @code{japanese-shift-jis} in 256 matches @code{japanese-shift-jis} in
255 @code{locale-preferred-coding-systems}, Emacs uses that encoding even 257 @code{locale-preferred-coding-systems}, Emacs uses that encoding even
256 though it might normally use @code{japanese-iso-8bit}. 258 though it might normally use @code{japanese-iso-8bit}.
257 259
258 The environment chosen from the locale when Emacs starts is 260 You can override the language environment chosen at startup with
259 overidden by any explicit use of the command 261 explicit use of the command @code{set-language-environment}, or with
260 @code{set-language-environment} or customization of 262 customization of @code{current-language-environment} in your init
261 @code{current-language-environment} in your init file. 263 file.
262 264
263 @kindex C-h L 265 @kindex C-h L
264 @findex describe-language-environment 266 @findex describe-language-environment
265 To display information about the effects of a certain language 267 To display information about the effects of a certain language
266 environment @var{lang-env}, use the command @kbd{C-h L @var{lang-env} 268 environment @var{lang-env}, use the command @kbd{C-h L @var{lang-env}
367 @code{input-method-verbose-flag} is non-@code{nil}, the list of possible 369 @code{input-method-verbose-flag} is non-@code{nil}, the list of possible
368 characters to type next is displayed in the echo area (but not when you 370 characters to type next is displayed in the echo area (but not when you
369 are in the minibuffer). 371 are in the minibuffer).
370 372
371 @cindex Leim package 373 @cindex Leim package
372 Input methods are implemented in the separate Leim package, which must 374 Input methods are implemented in the separate Leim package: they are
373 be installed with Emacs. 375 available only if the system administrator used Leim when building
376 Emacs. If Emacs was built without Leim, you will find that no input
377 methods are defined.
374 378
375 @node Select Input Method 379 @node Select Input Method
376 @section Selecting an Input Method 380 @section Selecting an Input Method
377 381
378 @table @kbd 382 @table @kbd
441 445
442 When multibyte characters are enabled, character codes 0240 (octal) 446 When multibyte characters are enabled, character codes 0240 (octal)
443 through 0377 (octal) are not really legitimate in the buffer. The valid 447 through 0377 (octal) are not really legitimate in the buffer. The valid
444 non-ASCII printing characters have codes that start from 0400. 448 non-ASCII printing characters have codes that start from 0400.
445 449
446 If you type a self-inserting character in the range 0240 450 If you type a self-inserting character in the range 0240 through
447 through 0377, Emacs assumes you intended to use one of the ISO 451 0377, or if you use @kbd{C-q} to insert one, Emacs assumes you
448 Latin-@var{n} character sets, and converts it to the Emacs code 452 intended to use one of the ISO Latin-@var{n} character sets, and
449 representing that Latin-@var{n} character. You select @emph{which} ISO 453 converts it to the Emacs code representing that Latin-@var{n}
450 Latin character set to use through your choice of language environment 454 character. You select @emph{which} ISO Latin character set to use
455 through your choice of language environment
451 @iftex 456 @iftex
452 (see above). 457 (see above).
453 @end iftex 458 @end iftex
454 @ifinfo 459 @ifinfo
455 (@pxref{Language Environments}). 460 (@pxref{Language Environments}).
456 @end ifinfo 461 @end ifinfo
457 If you do not specify a choice, the default is Latin-1. 462 If you do not specify a choice, the default is Latin-1.
458 463
459 The same thing happens when you use @kbd{C-q} to enter an octal code 464 If you insert a character in the range 0200 through 0237, which
460 in this range. If you enter a code in the range 0200 through 0237, 465 forms the @code{eight-bit-control} character set, it is inserted
461 which forms the @code{eight-bit-control} character set, it is inserted
462 literally. You should normally avoid doing this since buffers 466 literally. You should normally avoid doing this since buffers
463 containing such characters have to be written out in either the 467 containing such characters have to be written out in either the
464 @code{emacs-mule} or @code{raw-text} coding system, which is usually not 468 @code{emacs-mule} or @code{raw-text} coding system, which is usually
465 what you want. 469 not what you want.
466 470
467 @node Coding Systems 471 @node Coding Systems
468 @section Coding Systems 472 @section Coding Systems
469 @cindex coding systems 473 @cindex coding systems
470 474
650 654
651 @vindex inhibit-iso-escape-detection 655 @vindex inhibit-iso-escape-detection
652 @cindex escape sequences in files 656 @cindex escape sequences in files
653 By default, the automatic detection of coding system is sensitive to 657 By default, the automatic detection of coding system is sensitive to
654 escape sequences. If Emacs sees a sequence of characters that begin 658 escape sequences. If Emacs sees a sequence of characters that begin
655 with an @key{ESC} character, and the sequence is valid as an ISO-2022 659 with an escape character, and the sequence is valid as an ISO-2022
656 code, the code is determined as one of ISO-2022 encoding, and the file 660 code, that tells Emacs to use one of the ISO-2022 encodings to decode
657 is decoded by the corresponding coding system 661 the file.
658 (e.g. @code{iso-2022-7bit}). 662
659 663 However, there may be cases that you want to read escape sequences
660 However, there may be cases that you want to read escape sequences in 664 in a file as is. In such a case, you can set the variable
661 a file as is. In such a case, you can set th variable
662 @code{inhibit-iso-escape-detection} to non-@code{nil}. Then the code 665 @code{inhibit-iso-escape-detection} to non-@code{nil}. Then the code
663 detection will ignore any escape sequences, and so no file is detected 666 detection ignores any escape sequences, and never uses an ISO-2022
664 as being encoded in some of ISO-2022 encoding. The result is that all 667 encoding. The result is that all escape sequences become visible in
665 escape sequences become visible in a buffer. 668 the buffer.
666 669
667 The default value of @code{inhibit-iso-escape-detection} is 670 The default value of @code{inhibit-iso-escape-detection} is
668 @code{nil}, and it is strongly recommended not to change it. That's 671 @code{nil}. We recommend that you not change it permanently, only for
669 because many Emacs Lisp source files that contain non-ASCII characters 672 one specific operation. That's because many Emacs Lisp source files
670 are encoded in the coding system @code{iso-2022-7bit} in the Emacs 673 that contain non-ASCII characters are encoded in the coding system
671 distribution, and they won't be decoded correctly when you visit those 674 @code{iso-2022-7bit} in the Emacs distribution, and they won't be
672 files if you suppress the escape sequence detection. 675 decoded correctly when you visit those files if you suppress the
676 escape sequence detection.
673 677
674 @vindex coding 678 @vindex coding
675 You can specify the coding system for a particular file using the 679 You can specify the coding system for a particular file using the
676 @samp{-*-@dots{}-*-} construct at the beginning of a file, or a local 680 @samp{-*-@dots{}-*-} construct at the beginning of a file, or a local
677 variables list at the end (@pxref{File Variables}). You do this by 681 variables list at the end (@pxref{File Variables}). You do this by
698 @code{write-region}. If you want to write files from this buffer using 702 @code{write-region}. If you want to write files from this buffer using
699 a different coding system, you can specify a different coding system for 703 a different coding system, you can specify a different coding system for
700 the buffer using @code{set-buffer-file-coding-system} (@pxref{Specify 704 the buffer using @code{set-buffer-file-coding-system} (@pxref{Specify
701 Coding}). 705 Coding}).
702 706
703 While editing a file, you will sometimes insert characters which 707 You can insert any possible character into any Emacs buffer, but
704 cannot be encoded with the coding system stored in 708 most coding systems can only handle some of the possible characters.
705 @code{buffer-file-coding-system}. For example, suppose you start with 709 This means that you can insert characters that cannot be encoded with
706 an ASCII file and insert a few Latin-1 characters into it. Or you could 710 the coding system that will be used to save the buffer. For example,
707 edit a text file in Polish encoded in @code{iso-8859-2} and add to it 711 you could start with an ASCII file and insert a few Latin-1 characters
708 translations of several Polish words into Russian. When you save the 712 into it, or you could edit a text file in Polish encoded in
709 buffer, Emacs can no longer use the previous value of the buffer's 713 @code{iso-8859-2} and add to it translations of several Polish words
710 coding system, because the characters you added cannot be encoded by 714 into Russian. When you save the buffer, Emacs cannot use the current
711 that coding system. 715 value of @code{buffer-file-coding-system}, because the characters you
716 added cannot be encoded by that coding system.
712 717
713 When that happens, Emacs tries the most-preferred coding system (set 718 When that happens, Emacs tries the most-preferred coding system (set
714 by @kbd{M-x prefer-coding-system} or @kbd{M-x 719 by @kbd{M-x prefer-coding-system} or @kbd{M-x
715 set-language-environment}), and if that coding system can safely encode 720 set-language-environment}), and if that coding system can safely
716 all of the characters in the buffer, Emacs uses it, and stores its value 721 encode all of the characters in the buffer, Emacs uses it, and stores
717 in @code{buffer-file-coding-system}. Otherwise, Emacs pops up a window 722 its value in @code{buffer-file-coding-system}. Otherwise, Emacs
718 with a list of coding systems suitable for encoding the buffer, and 723 displays a list of coding systems suitable for encoding the buffer's
719 prompts you to choose one of those coding systems. 724 contents, and asks you to choose one of those coding systems.
720 725
721 If you insert characters which cannot be encoded by the buffer's 726 If you insert the unsuitable characters in a mail message, Emacs
722 coding system while editing a mail message, Emacs behaves a bit 727 behaves a bit differently. It additionally checks whether the
723 differently. It additionally checks whether the most-preferred coding 728 most-preferred coding system is recommended for use in MIME messages;
724 system is recommended for use in MIME messages; if it isn't, Emacs tells 729 if it isn't, Emacs tells you that the most-preferred coding system is
725 you that the most-preferred coding system is not recommended and prompts 730 not recommended and prompts you for another coding system. This is so
726 you for another coding system. This is so you won't inadvertently send 731 you won't inadvertently send a message encoded in a way that your
727 a message encoded in a way that your recipient's mail software will have 732 recipient's mail software will have difficulty decoding. (If you do
728 difficulty decoding. (If you do want to use the most-preferred coding 733 want to use the most-preferred coding system, you can type its name at
729 system, you can type its name at the Emacs prompt anyway.) 734 the Emacs prompt anyway.)
730 735
731 @vindex sendmail-coding-system 736 @vindex sendmail-coding-system
732 When you send a message with Mail mode (@pxref{Sending Mail}), Emacs has 737 When you send a message with Mail mode (@pxref{Sending Mail}), Emacs has
733 four different ways to determine the coding system to use for encoding 738 four different ways to determine the coding system to use for encoding
734 the message text. It tries the buffer's own value of 739 the message text. It tries the buffer's own value of
914 these buffers under the visited file name, saving may use the wrong file 919 these buffers under the visited file name, saving may use the wrong file
915 name, or it may get an error. If such a problem happens, use @kbd{C-x 920 name, or it may get an error. If such a problem happens, use @kbd{C-x
916 C-w} to specify a new file name for that buffer. 921 C-w} to specify a new file name for that buffer.
917 922
918 @vindex locale-coding-system 923 @vindex locale-coding-system
919 The variable @code{locale-coding-system} specifies a coding system to 924 The variable @code{locale-coding-system} specifies a coding system
920 use when encoding and decoding system strings such as system error 925 to use when encoding and decoding system strings such as system error
921 messages and @code{format-time-string} formats and time stamps. This 926 messages and @code{format-time-string} formats and time stamps. You
922 coding system should be compatible with the underlying system's coding 927 should choose a coding system that is compatible with the underlying
923 system, which is normally specified by the first environment variable in 928 system's text representation, which is normally specified by one of
924 the list @env{LC_ALL}, @env{LC_CTYPE}, @env{LANG} whose value is 929 the environment variables @env{LC_ALL}, @env{LC_CTYPE}, and
925 nonempty. 930 @env{LANG}. (The first one whose value is nonempty is the one that
931 determines the text representation.)
926 932
927 @node Fontsets 933 @node Fontsets
928 @section Fontsets 934 @section Fontsets
929 @cindex fontsets 935 @cindex fontsets
930 936
939 itself. Once you have defined a fontset, you can use it within Emacs by 945 itself. Once you have defined a fontset, you can use it within Emacs by
940 specifying its name, anywhere that you could use a single font. Of 946 specifying its name, anywhere that you could use a single font. Of
941 course, Emacs fontsets can use only the fonts that the X server 947 course, Emacs fontsets can use only the fonts that the X server
942 supports; if certain characters appear on the screen as hollow boxes, 948 supports; if certain characters appear on the screen as hollow boxes,
943 this means that the fontset in use for them has no font for those 949 this means that the fontset in use for them has no font for those
944 characters.@footnote{The installation instructions have information on 950 characters.@footnote{The Emacs installation instructions have information on
945 additional font support.} 951 additional font support.}
946 952
947 Emacs creates two fontsets automatically: the @dfn{standard fontset} 953 Emacs creates two fontsets automatically: the @dfn{standard fontset}
948 and the @dfn{startup fontset}. The standard fontset is most likely to 954 and the @dfn{startup fontset}. The standard fontset is most likely to
949 have fonts for a wide variety of non-ASCII characters; however, this is 955 have fonts for a wide variety of non-ASCII characters; however, this is
1097 @xref{Font X}, for more information about font naming in X. 1103 @xref{Font X}, for more information about font naming in X.
1098 1104
1099 @node Undisplayable Characters 1105 @node Undisplayable Characters
1100 @section Undisplayable Characters 1106 @section Undisplayable Characters
1101 1107
1102 Your terminal may not be able to display some non-@sc{ascii} characters. 1108 Your terminal may be unable to display some non-@sc{ascii}
1103 Most non-windowing terminals can only use a single character set, 1109 characters. Most non-windowing terminals can only use a single
1104 specified by the variable @code{default-terminal-coding-system} 1110 character set (use the variable @code{default-terminal-coding-system}
1105 (@pxref{Specify Coding}) and characters which can't be encoded in it are 1111 (@pxref{Specify Coding}) to tell Emacs which one); characters which
1106 displayed as @samp{?} by default. Windowing terminals may not have the 1112 can't be encoded in that coding system are displayed as @samp{?} by
1107 necessary font available to display a given character and display a 1113 default.
1108 hollow box instead. You can change the default behavior. 1114
1109 1115 Windowing terminals can display a broader range of characters, but
1110 If you use Latin-1 characters but your terminal can't display Latin-1, 1116 you may not have fonts installed for all of them; characters that have
1111 you can arrange to display mnemonic @sc{ascii} sequences instead, e.g.@: 1117 no font appear as a hollow box.
1112 @samp{"o} for o-umlaut. Load the library @file{iso-ascii} to do this. 1118
1113 1119 If you use Latin-1 characters but your terminal can't display
1114 If your terminal can display Latin-1, you can display characters from 1120 Latin-1, you can arrange to display mnemonic @sc{ascii} sequences
1115 other European character sets using a mixture of equivalent Latin-1 1121 instead, e.g.@: @samp{"o} for o-umlaut. Load the library
1116 characters and @sc{ascii} mnemonics. Use the Custom option 1122 @file{iso-ascii} to do this.
1117 @code{latin1-display} to enable this. The mnemonic @sc{ascii} sequences 1123
1118 mostly correspond to those of the prefix input methods. 1124 If your terminal can display Latin-1, you can display characters
1125 from other European character sets using a mixture of equivalent
1126 Latin-1 characters and @sc{ascii} mnemonics. Use the Custom option
1127 @code{latin1-display} to enable this. The mnemonic @sc{ascii}
1128 sequences mostly correspond to those of the prefix input methods.
1119 1129
1120 @node Single-Byte Character Support 1130 @node Single-Byte Character Support
1121 @section Single-byte Character Set Support 1131 @section Single-byte Character Set Support
1122 1132
1123 @cindex European character sets 1133 @cindex European character sets
1170 @cindex 8-bit input 1180 @cindex 8-bit input
1171 @item 1181 @item
1172 @findex set-keyboard-coding-system 1182 @findex set-keyboard-coding-system
1173 @vindex keyboard-coding-system 1183 @vindex keyboard-coding-system
1174 If your keyboard can generate character codes 128 and up, representing 1184 If your keyboard can generate character codes 128 and up, representing
1175 non-ASCII characters, use the command @code{M-x 1185 non-ASCII characters, you can type those character codes directly.
1176 set-keyboard-coding-system} or the Custom option 1186
1177 @code{keyboard-coding-system} to specify this in the same way as for 1187 On a windowing terminal, you should not need to do anything special to
1178 multibyte usage (@pxref{Specify Coding}). 1188 use these keys; they should simply work. On a text-only terminal, you
1179 1189 should use the command @code{M-x set-keyboard-coding-system} or the
1180 It is not necessary to do this under a window system which can 1190 Custom option @code{keyboard-coding-system} to specify which coding
1181 distinguish 8-bit characters and Meta keys. If you do this on a normal 1191 system your keyboard uses (@pxref{Specify Coding}). Enabling this
1182 terminal, you will probably need to use @kbd{ESC} to type Meta 1192 feature will probably require you to use @kbd{ESC} to type Meta
1183 characters.@footnote{In some cases, such as the Linux console and 1193 characters; however, on a Linux console or in @code{xterm}, you can
1184 @code{xterm}, you can arrange for Meta to be converted to @kbd{ESC} and 1194 arrange for Meta to be converted to @kbd{ESC} and still be able to type
1185 still be able to type 8-bit characters present directly on the keyboard or 1195 8-bit characters present directly on the keyboard or using
1186 using @kbd{Compose} or @kbd{AltGr} keys.} @xref{User Input}. 1196 @kbd{Compose} or @kbd{AltGr} keys. @xref{User Input}.
1187 1197
1188 @item 1198 @item
1189 You can use an input method for the selected language environment. 1199 You can use an input method for the selected language environment.
1190 @xref{Input Methods}. When you use an input method in a unibyte buffer, 1200 @xref{Input Methods}. When you use an input method in a unibyte buffer,
1191 the non-ASCII character you specify with it is converted to unibyte. 1201 the non-ASCII character you specify with it is converted to unibyte.
1203 1213
1204 @kbd{C-x 8} works by loading the @code{iso-transl} library. Once that 1214 @kbd{C-x 8} works by loading the @code{iso-transl} library. Once that
1205 library is loaded, the @key{ALT} modifier key, if you have one, serves 1215 library is loaded, the @key{ALT} modifier key, if you have one, serves
1206 the same purpose as @kbd{C-x 8}; use @key{ALT} together with an accent 1216 the same purpose as @kbd{C-x 8}; use @key{ALT} together with an accent
1207 character to modify the following letter. In addition, if you have keys 1217 character to modify the following letter. In addition, if you have keys
1208 for the Latin-1 ``dead accent characters'', they too are defined to 1218 for the Latin-1 ``dead accent characters,'' they too are defined to
1209 compose with the following character, once @code{iso-transl} is loaded. 1219 compose with the following character, once @code{iso-transl} is loaded.
1210 Use @kbd{C-x 8 C-h} to list the available translations as mnemonic 1220 Use @kbd{C-x 8 C-h} to list the available translations as mnemonic
1211 command names. 1221 command names.
1212 1222
1213 @item 1223 @item
1214 @cindex @code{iso-acc} library 1224 @cindex @code{iso-acc} library
1215 @cindex ISO Accents mode 1225 @cindex ISO Accents mode
1216 @findex iso-accents-mode 1226 @findex iso-accents-mode
1217 @cindex Latin-1, Latin-2 and Latin-3 input mode 1227 @cindex Latin-1, Latin-2 and Latin-3 input mode
1218 For Latin-1, Latin-2 and Latin-3, @kbd{M-x iso-accents-mode} installs a 1228 For Latin-1, Latin-2 and Latin-3, @kbd{M-x iso-accents-mode} installs
1219 minor mode which provides a facility like the @code{latin-1-prefix} 1229 a minor mode which works much like the @code{latin-1-prefix} input
1220 input method but independent of the Leim package. This mode is 1230 method but does not depend on having the input methods installed. This
1221 buffer-local. It can be customized for various languages with @kbd{M-x 1231 mode is buffer-local. It can be customized for various languages with
1222 iso-accents-customize}. 1232 @kbd{M-x iso-accents-customize}.
1223 @end itemize 1233 @end itemize
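
A few of the settings described in the revised text can be sketched as an Emacs Lisp fragment. This is only an illustration, not part of the changeset: the file name is hypothetical, the choice of Latin-1 is an assumption made for the example, and each form is independent of the others.

```elisp
;; Illustrative fragment only; evaluate the forms individually as needed.

;; Override the language environment that was chosen from the locale
;; at startup ("Latin-1" is an assumed choice for this example).
(set-language-environment "Latin-1")

;; After changing LC_ALL, LC_CTYPE, or LANG while Emacs is running,
;; readjust the language environment from the new locale.  With a nil
;; argument the function consults those environment variables.
(set-locale-environment nil)

;; Suppress ISO-2022 escape-sequence detection for one operation only,
;; instead of changing the variable permanently.  The file name is
;; hypothetical.
(let ((inhibit-iso-escape-detection t))
  (find-file-noselect "/tmp/file-with-escapes.txt"))

;; On a text-only terminal, tell Emacs which coding system the
;; keyboard sends (assumed here to be Latin-1).
(set-keyboard-coding-system 'iso-latin-1)
```

The `let` binding follows the recommendation in the text to change `inhibit-iso-escape-detection` only for one specific operation, so that files encoded in `iso-2022-7bit` are still decoded correctly when you visit them later.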