comparison man/mule.texi @ 38460:6bee7ffac2cd
Proofreading fixes from Chris Green <chris_e_green@yahoo.com>
and "J. Otto Tennant" <jot@visi.com>.
author | Eli Zaretskii <eliz@gnu.org> |
---|---|
date | Tue, 17 Jul 2001 10:39:21 +0000 |
parents | 4eaf5126c0e5 |
children | 1518ad710658 |
38459:08aca6a91513 | 38460:6bee7ffac2cd |
---|---|
215 | 215 |
216 Emacs normally loads Lisp files as multibyte, regardless of whether | 216 Emacs normally loads Lisp files as multibyte, regardless of whether |
217 you used @samp{--unibyte}. This includes the Emacs initialization | 217 you used @samp{--unibyte}. This includes the Emacs initialization |
218 file, @file{.emacs}, and the initialization files of Emacs packages | 218 file, @file{.emacs}, and the initialization files of Emacs packages |
219 such as Gnus. However, you can specify unibyte loading for a | 219 such as Gnus. However, you can specify unibyte loading for a |
220 particular Lisp file, by putting @samp{-*-unibyte: t;-*-} in a comment | 220 particular Lisp file, by putting @w{@samp{-*-unibyte: t;-*-}} in a |
221 on the first line. Then that file is always loaded as unibyte text, | 221 comment on the first line. Then that file is always loaded as unibyte |
222 even if you did not start Emacs with @samp{--unibyte}. The motivation | 222 text, even if you did not start Emacs with @samp{--unibyte}. The |
223 for these conventions is that it is more reliable to always load any | 223 motivation for these conventions is that it is more reliable to always |
224 particular Lisp file in the same way. However, you can load a Lisp | 224 load any particular Lisp file in the same way. However, you can load |
225 file as unibyte, on any one occasion, by typing @kbd{C-x @key{RET} c | 225 a Lisp file as unibyte, on any one occasion, by typing @kbd{C-x |
226 raw-text @key{RET}} immediately before loading it. | 226 @key{RET} c raw-text @key{RET}} immediately before loading it. |
227 | 227 |
228 The mode line indicates whether multibyte character support is enabled | 228 The mode line indicates whether multibyte character support is enabled |
229 in the current buffer. If it is, there are two or more characters (most | 229 in the current buffer. If it is, there are two or more characters (most |
230 often two dashes) before the colon near the beginning of the mode line. | 230 often two dashes) before the colon near the beginning of the mode line. |
231 When multibyte characters are not enabled, just one dash precedes the | 231 When multibyte characters are not enabled, just one dash precedes the |
300 table and terminal coding system, the locale coding system, and the | 300 table and terminal coding system, the locale coding system, and the |
301 preferred coding system as needed for the locale. | 301 preferred coding system as needed for the locale. |
302 | 302 |
303 If you modify the @env{LC_ALL}, @env{LC_CTYPE}, or @env{LANG} | 303 If you modify the @env{LC_ALL}, @env{LC_CTYPE}, or @env{LANG} |
304 environment variables while running Emacs, you may want to invoke the | 304 environment variables while running Emacs, you may want to invoke the |
305 @code{set-locale-environment} function afterwards to readjust the | 305 @code{set-locale-environment} function afterwards to re-adjust the |
306 language environment from the new locale. | 306 language environment from the new locale. |
307 | 307 |
308 @vindex locale-preferred-coding-systems | 308 @vindex locale-preferred-coding-systems |
309 The @code{set-locale-environment} function normally uses the preferred | 309 The @code{set-locale-environment} function normally uses the preferred |
310 coding system established by the language environment to decode system | 310 coding system established by the language environment to decode system |
361 has its own input method; sometimes several languages which use the same | 361 has its own input method; sometimes several languages which use the same |
362 characters can share one input method. A few languages support several | 362 characters can share one input method. A few languages support several |
363 input methods. | 363 input methods. |
364 | 364 |
365 The simplest kind of input method works by mapping ASCII letters | 365 The simplest kind of input method works by mapping ASCII letters |
366 into another alphabet; this allows you to type characters which your | 366 into another alphabet; this allows you to type characters that your |
367 keyboard doesn't support directly. This is how the Greek and Russian | 367 keyboard doesn't support directly. This is how the Greek and Russian |
368 input methods work. | 368 input methods work. |
369 | 369 |
370 A more powerful technique is composition: converting sequences of | 370 A more powerful technique is composition: converting sequences of |
371 characters into one letter. Many European input methods use composition | 371 characters into one letter. Many European input methods use composition |
403 characters you have just entered will not combine with subsequent | 403 characters you have just entered will not combine with subsequent |
404 characters. For example, in input method @code{latin-1-postfix}, the | 404 characters. For example, in input method @code{latin-1-postfix}, the |
405 sequence @kbd{e '} combines to form an @samp{e} with an accent. What if | 405 sequence @kbd{e '} combines to form an @samp{e} with an accent. What if |
406 you want to enter them as separate characters? | 406 you want to enter them as separate characters? |
407 | 407 |
408 One way is to type the accent twice; that is a special feature for | 408 One way is to type the accent twice; this is a special feature for |
409 entering the separate letter and accent. For example, @kbd{e ' '} gives | 409 entering the separate letter and accent. For example, @kbd{e ' '} gives |
410 you the two characters @samp{e'}. Another way is to type another letter | 410 you the two characters @samp{e'}. Another way is to type another letter |
411 after the @kbd{e}---something that won't combine with that---and | 411 after the @kbd{e}---something that won't combine with that---and |
412 immediately delete it. For example, you could type @kbd{e e @key{DEL} | 412 immediately delete it. For example, you could type @kbd{e e @key{DEL} |
413 '} to get separate @samp{e} and @samp{'}. | 413 '} to get separate @samp{e} and @samp{'}. |
468 @findex set-input-method | 468 @findex set-input-method |
469 @vindex current-input-method | 469 @vindex current-input-method |
470 @kindex C-x RET C-\ | 470 @kindex C-x RET C-\ |
471 To choose an input method for the current buffer, use @kbd{C-x | 471 To choose an input method for the current buffer, use @kbd{C-x |
472 @key{RET} C-\} (@code{set-input-method}). This command reads the | 472 @key{RET} C-\} (@code{set-input-method}). This command reads the |
473 input method name with the minibuffer; the name normally starts with the | 473 input method name from the minibuffer; the name normally starts with the |
474 language environment that it is meant to be used with. The variable | 474 language environment that it is meant to be used with. The variable |
475 @code{current-input-method} records which input method is selected. | 475 @code{current-input-method} records which input method is selected. |
476 | 476 |
477 @findex toggle-input-method | 477 @findex toggle-input-method |
478 @kindex C-\ | 478 @kindex C-\ |
604 | 604 |
605 @kindex C-h C | 605 @kindex C-h C |
606 @findex describe-coding-system | 606 @findex describe-coding-system |
607 The command @kbd{C-h C} (@code{describe-coding-system}) displays | 607 The command @kbd{C-h C} (@code{describe-coding-system}) displays |
608 information about particular coding systems. You can specify a coding | 608 information about particular coding systems. You can specify a coding |
609 system name as argument; alternatively, with an empty argument, it | 609 system name as the argument; alternatively, with an empty argument, it |
610 describes the coding systems currently selected for various purposes, | 610 describes the coding systems currently selected for various purposes, |
611 both in the current buffer and as the defaults, and the priority list | 611 both in the current buffer and as the defaults, and the priority list |
612 for recognizing coding systems (@pxref{Recognize Coding}). | 612 for recognizing coding systems (@pxref{Recognize Coding}). |
613 | 613 |
614 @findex list-coding-systems | 614 @findex list-coding-systems |
716 list, so that it is preferred to all others. If you use this command | 716 list, so that it is preferred to all others. If you use this command |
717 several times, each use adds one element to the front of the priority | 717 several times, each use adds one element to the front of the priority |
718 list. | 718 list. |
719 | 719 |
720 If you use a coding system that specifies the end-of-line conversion | 720 If you use a coding system that specifies the end-of-line conversion |
721 type, such as @code{iso-8859-1-dos}, what that means is that Emacs | 721 type, such as @code{iso-8859-1-dos}, what this means is that Emacs |
722 should attempt to recognize @code{iso-8859-1} with priority, and should | 722 should attempt to recognize @code{iso-8859-1} with priority, and should |
723 use DOS end-of-line conversion in case it recognizes @code{iso-8859-1}. | 723 use DOS end-of-line conversion if it recognizes @code{iso-8859-1}. |
724 | 724 |
725 @vindex file-coding-system-alist | 725 @vindex file-coding-system-alist |
726 Sometimes a file name indicates which coding system to use for the | 726 Sometimes a file name indicates which coding system to use for the |
727 file. The variable @code{file-coding-system-alist} specifies this | 727 file. The variable @code{file-coding-system-alist} specifies this |
728 correspondence. There is a special function | 728 correspondence. There is a special function |
768 the buffer. | 768 the buffer. |
769 | 769 |
770 The default value of @code{inhibit-iso-escape-detection} is | 770 The default value of @code{inhibit-iso-escape-detection} is |
771 @code{nil}. We recommend that you not change it permanently, only for | 771 @code{nil}. We recommend that you not change it permanently, only for |
772 one specific operation. That's because many Emacs Lisp source files | 772 one specific operation. That's because many Emacs Lisp source files |
773 that contain non-ASCII characters are encoded in the coding system | 773 in the Emacs distribution contain non-ASCII characters encoded in the |
774 @code{iso-2022-7bit} in the Emacs distribution, and they won't be | 774 coding system @code{iso-2022-7bit}, and they won't be |
775 decoded correctly when you visit those files if you suppress the | 775 decoded correctly when you visit those files if you suppress the |
776 escape sequence detection. | 776 escape sequence detection. |
777 | 777 |
778 @vindex coding | 778 @vindex coding |
779 You can specify the coding system for a particular file using the | 779 You can specify the coding system for a particular file using the |
780 @samp{-*-@dots{}-*-} construct at the beginning of a file, or a local | 780 @w{@samp{-*-@dots{}-*-}} construct at the beginning of a file, or a |
781 variables list at the end (@pxref{File Variables}). You do this by | 781 local variables list at the end (@pxref{File Variables}). You do this |
782 defining a value for the ``variable'' named @code{coding}. Emacs does | 782 by defining a value for the ``variable'' named @code{coding}. Emacs |
783 not really have a variable @code{coding}; instead of setting a variable, | 783 does not really have a variable @code{coding}; instead of setting a |
784 it uses the specified coding system for the file. For example, | 784 variable, it uses the specified coding system for the file. For |
785 @samp{-*-mode: C; coding: latin-1;-*-} specifies use of the Latin-1 | 785 example, @samp{-*-mode: C; coding: latin-1;-*-} specifies use of the |
786 coding system, as well as C mode. If you specify the coding explicitly | 786 Latin-1 coding system, as well as C mode. If you specify the coding |
787 in the file, that overrides @code{file-coding-system-alist}. | 787 explicitly in the file, that overrides |
| | 788 @code{file-coding-system-alist}. |
788 | 789 |
789 @vindex auto-coding-alist | 790 @vindex auto-coding-alist |
790 @vindex auto-coding-regexp-alist | 791 @vindex auto-coding-regexp-alist |
791 The variables @code{auto-coding-alist} and | 792 The variables @code{auto-coding-alist} and |
792 @code{auto-coding-regexp-alist} are the strongest way to specify the | 793 @code{auto-coding-regexp-alist} are the strongest way to specify the |
817 the buffer using @code{set-buffer-file-coding-system} (@pxref{Specify | 818 the buffer using @code{set-buffer-file-coding-system} (@pxref{Specify |
818 Coding}). | 819 Coding}). |
819 | 820 |
820 You can insert any possible character into any Emacs buffer, but | 821 You can insert any possible character into any Emacs buffer, but |
821 most coding systems can only handle some of the possible characters. | 822 most coding systems can only handle some of the possible characters. |
822 This means that you can insert characters that cannot be encoded with | 823 This means that it is possible for you to insert characters that |
823 the coding system that will be used to save the buffer. For example, | 824 cannot be encoded with the coding system that will be used to save the |
824 you could start with an ASCII file and insert a few Latin-1 characters | 825 buffer. For example, you could start with an ASCII file and insert a |
825 into it, or you could edit a text file in Polish encoded in | 826 few Latin-1 characters into it, or you could edit a text file in |
826 @code{iso-8859-2} and add to it translations of several Polish words | 827 Polish encoded in @code{iso-8859-2} and add to it translations of |
827 into Russian. When you save the buffer, Emacs cannot use the current | 828 several Polish words into Russian. When you save the buffer, Emacs |
828 value of @code{buffer-file-coding-system}, because the characters you | 829 cannot use the current value of @code{buffer-file-coding-system}, |
829 added cannot be encoded by that coding system. | 830 because the characters you added cannot be encoded by that coding |
| | 831 system. |
830 | 832 |
831 When that happens, Emacs tries the most-preferred coding system (set | 833 When that happens, Emacs tries the most-preferred coding system (set |
832 by @kbd{M-x prefer-coding-system} or @kbd{M-x | 834 by @kbd{M-x prefer-coding-system} or @kbd{M-x |
833 set-language-environment}), and if that coding system can safely | 835 set-language-environment}), and if that coding system can safely |
834 encode all of the characters in the buffer, Emacs uses it, and stores | 836 encode all of the characters in the buffer, Emacs uses it, and stores |
857 if that is non-@code{nil}. If all of these three values are @code{nil}, | 859 if that is non-@code{nil}. If all of these three values are @code{nil}, |
858 Emacs encodes outgoing mail using the Latin-1 coding system. | 860 Emacs encodes outgoing mail using the Latin-1 coding system. |
859 | 861 |
860 @vindex rmail-decode-mime-charset | 862 @vindex rmail-decode-mime-charset |
861 When you get new mail in Rmail, each message is translated | 863 When you get new mail in Rmail, each message is translated |
862 automatically from the coding system it is written in---as if it were a | 864 automatically from the coding system it is written in, as if it were a |
863 separate file. This uses the priority list of coding systems that you | 865 separate file. This uses the priority list of coding systems that you |
864 have specified. If a MIME message specifies a character set, Rmail | 866 have specified. If a MIME message specifies a character set, Rmail |
865 obeys that specification, unless @code{rmail-decode-mime-charset} is | 867 obeys that specification, unless @code{rmail-decode-mime-charset} is |
866 @code{nil}. | 868 @code{nil}. |
867 | 869 |
1039 to use when encoding and decoding system strings such as system error | 1041 to use when encoding and decoding system strings such as system error |
1040 messages and @code{format-time-string} formats and time stamps. You | 1042 messages and @code{format-time-string} formats and time stamps. You |
1041 should choose a coding system that is compatible with the underlying | 1043 should choose a coding system that is compatible with the underlying |
1042 system's text representation, which is normally specified by one of | 1044 system's text representation, which is normally specified by one of |
1043 the environment variables @env{LC_ALL}, @env{LC_CTYPE}, and | 1045 the environment variables @env{LC_ALL}, @env{LC_CTYPE}, and |
1044 @env{LANG}. (The first one whose value is nonempty is the one that | 1046 @env{LANG}. (The first one, in the order specified above, whose value |
1045 determines the text representation.) | 1047 is nonempty is the one that determines the text representation.) |
1046 | 1048 |
1047 @node Fontsets | 1049 @node Fontsets |
1048 @section Fontsets | 1050 @section Fontsets |
1049 @cindex fontsets | 1051 @cindex fontsets |
1050 | 1052 |
1051 A font for X typically defines shapes for one alphabet or script. | 1053 A font for X typically defines shapes for a single alphabet or script. |
1052 Therefore, displaying the entire range of scripts that Emacs supports | 1054 Therefore, displaying the entire range of scripts that Emacs supports |
1053 requires a collection of many fonts. In Emacs, such a collection is | 1055 requires a collection of many fonts. In Emacs, such a collection is |
1054 called a @dfn{fontset}. A fontset is defined by a list of fonts, each | 1056 called a @dfn{fontset}. A fontset is defined by a list of fonts, each |
1055 assigned to handle a range of character codes. | 1057 assigned to handle a range of character codes. |
1056 | 1058 |
1066 | 1068 |
1067 Emacs creates two fontsets automatically: the @dfn{standard fontset} | 1069 Emacs creates two fontsets automatically: the @dfn{standard fontset} |
1068 and the @dfn{startup fontset}. The standard fontset is most likely to | 1070 and the @dfn{startup fontset}. The standard fontset is most likely to |
1069 have fonts for a wide variety of non-ASCII characters; however, this is | 1071 have fonts for a wide variety of non-ASCII characters; however, this is |
1070 not the default for Emacs to use. (By default, Emacs tries to find a | 1072 not the default for Emacs to use. (By default, Emacs tries to find a |
1071 font which has bold and italic variants.) You can specify use of the | 1073 font that has bold and italic variants.) You can specify use of the |
1072 standard fontset with the @samp{-fn} option, or with the @samp{Font} X | 1074 standard fontset with the @samp{-fn} option, or with the @samp{Font} X |
1073 resource (@pxref{Font X}). For example, | 1075 resource (@pxref{Font X}). For example, |
1074 | 1076 |
1075 @example | 1077 @example |
1076 emacs -fn fontset-standard | 1078 emacs -fn fontset-standard |
1134 @end example | 1136 @end example |
1135 | 1137 |
1136 With the X resource @samp{Emacs.Font}, you can specify a fontset name | 1138 With the X resource @samp{Emacs.Font}, you can specify a fontset name |
1137 just like an actual font name. But be careful not to specify a fontset | 1139 just like an actual font name. But be careful not to specify a fontset |
1138 name in a wildcard resource like @samp{Emacs*Font}---that wildcard | 1140 name in a wildcard resource like @samp{Emacs*Font}---that wildcard |
1139 specification applies to various other purposes, such as menus, and | 1141 specification is used for various other purposes, such as menus, and |
1140 menus cannot handle fontsets. | 1142 menus cannot handle fontsets. |
1141 | 1143 |
1142 You can specify additional fontsets using X resources named | 1144 You can specify additional fontsets using X resources named |
1143 @samp{Fontset-@var{n}}, where @var{n} is an integer starting from 0. | 1145 @samp{Fontset-@var{n}}, where @var{n} is an integer starting from 0. |
1144 The resource value should have this form: | 1146 The resource value should have this form: |
1169 | 1171 |
1170 In addition, when several consecutive fields are wildcards, Emacs | 1172 In addition, when several consecutive fields are wildcards, Emacs |
1171 collapses them into a single wildcard. This is to prevent use of | 1173 collapses them into a single wildcard. This is to prevent use of |
1172 auto-scaled fonts. Fonts made by scaling larger fonts are not usable | 1174 auto-scaled fonts. Fonts made by scaling larger fonts are not usable |
1173 for editing, and scaling a smaller font is not useful because it is | 1175 for editing, and scaling a smaller font is not useful because it is |
1174 better to use the smaller font in its own size, which Emacs does. | 1176 better to use the smaller font in its own size, which is what Emacs |
| | 1177 does. |
1175 | 1178 |
1176 Thus if @var{fontpattern} is this, | 1179 Thus if @var{fontpattern} is this, |
1177 | 1180 |
1178 @example | 1181 @example |
1179 -*-fixed-medium-r-normal-*-24-*-*-*-*-*-fontset-24 | 1182 -*-fixed-medium-r-normal-*-24-*-*-*-*-*-fontset-24 |
1248 @cindex European character sets | 1251 @cindex European character sets |
1249 @cindex accented characters | 1252 @cindex accented characters |
1250 @cindex ISO Latin character sets | 1253 @cindex ISO Latin character sets |
1251 @cindex Unibyte operation | 1254 @cindex Unibyte operation |
1252 The ISO 8859 Latin-@var{n} character sets define character codes in | 1255 The ISO 8859 Latin-@var{n} character sets define character codes in |
1253 the range 160 to 255 to handle the accented letters and punctuation | 1256 the range 0240 to 0377 octal (160 to 255 decimal) to handle the |
1254 needed by various European languages (and some non-European ones). | 1257 accented letters and punctuation needed by various European languages |
1255 If you disable multibyte | 1258 (and some non-European ones). If you disable multibyte characters, |
1256 characters, Emacs can still handle @emph{one} of these character codes | 1259 Emacs can still handle @emph{one} of these character codes at a time. |
1257 at a time. To specify @emph{which} of these codes to use, invoke | 1260 To specify @emph{which} of these codes to use, invoke @kbd{M-x |
1258 @kbd{M-x set-language-environment} and specify a suitable language | 1261 set-language-environment} and specify a suitable language environment |
1259 environment such as @samp{Latin-@var{n}}. | 1262 such as @samp{Latin-@var{n}}. |
1260 | 1263 |
1261 For more information about unibyte operation, see @ref{Enabling | 1264 For more information about unibyte operation, see @ref{Enabling |
1262 Multibyte}. Note particularly that you probably want to ensure that | 1265 Multibyte}. Note particularly that you probably want to ensure that |
1263 your initialization files are read as unibyte if they contain non-ASCII | 1266 your initialization files are read as unibyte if they contain non-ASCII |
1264 characters. | 1267 characters. |
1280 Latin-@var{n} character sets could be implemented, but we don't have | 1283 Latin-@var{n} character sets could be implemented, but we don't have |
1281 them yet. | 1284 them yet. |
1282 | 1285 |
1283 @findex standard-display-8bit | 1286 @findex standard-display-8bit |
1284 @cindex 8-bit display | 1287 @cindex 8-bit display |
1285 Normally non-ISO-8859 characters (between characters 128 and 159 | 1288 Normally non-ISO-8859 characters (decimal codes between 128 and 159 |
1286 inclusive) are displayed as octal escapes. You can change this for | 1289 inclusive) are displayed as octal escapes. You can change this for |
1287 non-standard ``extended'' versions of ISO-8859 character sets by using the | 1290 non-standard ``extended'' versions of ISO-8859 character sets by using the |
1288 function @code{standard-display-8bit} in the @code{disp-table} library. | 1291 function @code{standard-display-8bit} in the @code{disp-table} library. |
1289 | 1292 |
1290 There are several ways you can input single-byte non-ASCII | 1293 There are several ways you can input single-byte non-ASCII |
1291 characters: | 1294 characters: |
1292 | 1295 |
1293 @itemize @bullet | 1296 @itemize @bullet |
1294 @cindex 8-bit input | 1297 @cindex 8-bit input |
1295 @item | 1298 @item |
1296 If your keyboard can generate character codes 128 and up, representing | 1299 If your keyboard can generate character codes 128 (decimal) and up, |
1297 non-ASCII characters, you can type those character codes directly. | 1300 representing non-ASCII characters, you can type those character codes |
| | 1301 directly. |
1298 | 1302 |
1299 On a windowing terminal, you should not need to do anything special to | 1303 On a windowing terminal, you should not need to do anything special to |
1300 use these keys; they should simply work. On a text-only terminal, you | 1304 use these keys; they should simply work. On a text-only terminal, you |
1301 should use the command @code{M-x set-keyboard-coding-system} or the | 1305 should use the command @code{M-x set-keyboard-coding-system} or the |
1302 Custom option @code{keyboard-coding-system} to specify which coding | 1306 Custom option @code{keyboard-coding-system} to specify which coding |