emacs: man/mule.texi comparison

comparison man/mule.texi @ 37584:9a7fd51a92b3

(International): Add an overview of Mule features, with pointers to detailed description.
(Enabling Multibyte): Describe how to switch a unibyte session to multibyte. Mention that by default, all sessions are multibyte.
(Coding Systems): Make it clear that cpNNN are coding systems, and should be used as such.
(Recognize Coding): Explain that Emacs decodes text as part of reading it. Mention revert-buffer as a means to redecode a file.

author	Eli Zaretskii <eliz@gnu.org>
date	Sun, 06 May 2001 11:27:54 +0000
parents	07200bf360ab
children	5a2458f097b0

-:313d4c5de5ca
+:9a7fd51a92b3
 ``MULti-lingual Enhancement to GNU Emacs'')
 Emacs also supports various encodings of these characters used by
 other internationalized software, such as word processors and mailers.
+Emacs allows editing text with international characters by supporting
+all the related activities:
+@itemize @bullet
+@item
+You can visit files with non-ASCII characters, save non-ASCII text, and
+pass non-ASCII text between Emacs and programs it invokes (such as
+compilers, spell-checkers, and mailers).  Setting your language
+environment (@pxref{Language Environments}) takes care of setting up the
+coding systems and other options for a specific language or culture.
+Alternatively, you can specify how Emacs should encode or decode text
+for each command; see @ref{Specify Coding}.
+@item
+You can display non-ASCII characters encoded by the various scripts.
+This works by using appropriate fonts on X and similar graphics
+displays (@pxref{Defining Fontsets}), and by sending special codes to
+text-only displays (@pxref{Specify Coding}).  If some characters are
+displayed incorrectly, refer to @ref{Undisplayable Characters}, which
+describes possible problems and explains how to solve them.
+@item
+You can insert non-ASCII characters or search for them.  To do that,
+you can specify an input method (@pxref{Select Input Method}) suitable
+for your language, or use the default input method set up when you set
+your language environment.  (Emacs input methods are part of the Leim
+package, which must be installed for you to be able to use them.)  If
+your keyboard can produce non-ASCII characters, you can select an
+appropriate keyboard coding system (@pxref{Specify Coding}), and Emacs
+will accept those characters.  Latin-1 characters can also be input by
+using the @kbd{C-x 8} prefix, see @ref{Single-Byte Character Support,
+C-x 8}.
+@end itemize
+The rest of this chapter describes these issues in detail.
 @menu
 * International Intro::     Basic concepts of multibyte characters.
 * Enabling Multibyte::      Controlling whether to use multibyte characters.
 * Language Environments::   Setting things up for the language you use.
 * Input Methods::           Entering text characters not on your keyboard.
 @end ignore
 @node Enabling Multibyte
 @section Enabling Multibyte Characters
+@cindex turn multibyte support on or off
 You can enable or disable multibyte character support, either for
 Emacs as a whole, or for a single buffer.  When multibyte characters are
 disabled in a buffer, then each byte in that buffer represents a
 character, even codes 0200 through 0377.  The old features for
 supporting the European character sets, ISO Latin-1 and ISO Latin-2,
 However, there is no need to turn off multibyte character support to
 use ISO Latin; the Emacs multibyte character set includes all the
 characters in these character sets, and Emacs can translate
 automatically to and from the ISO codes.
+By default, Emacs starts in multibyte mode, because that allows you to
+use all the supported languages and scripts without limitations.
 To edit a particular file in unibyte representation, visit it using
 @code{find-file-literally}.  @xref{Visiting}.  To convert a buffer in
 multibyte representation into a single-byte representation of the same
 characters, the easiest way is to save the contents in a file, kill the
 @vindex default-enable-multibyte-characters
 To turn off multibyte character support by default, start Emacs with
 the @samp{--unibyte} option (@pxref{Initial Options}), or set the
 environment variable @env{EMACS_UNIBYTE}.  You can also customize
 @code{enable-multibyte-characters} or, equivalently, directly set the
-variable @code{default-enable-multibyte-characters} in your init file to
+variable @code{default-enable-multibyte-characters} to @code{nil} in
-have basically the same effect as @samp{--unibyte}.
+your init file to have basically the same effect as @samp{--unibyte}.
+@findex toggle-enable-multibyte-characters
+To convert a unibyte session to a multibyte session, set
+@code{default-enable-multibyte-characters} to @code{t}.  Buffers which
+were created in the unibyte session before you turn on multibyte support
+will stay unibyte.  You can turn on multibyte support in a specific
+buffer by invoking the command @code{toggle-enable-multibyte-characters}
+in that buffer.
 @cindex Lisp files, and multibyte operation
 @cindex multibyte operation, and Lisp files
 @cindex unibyte operation, and Lisp files
 @cindex init file, and non-ASCII characters
 language name.  Some coding systems are used for several languages;
 their names usually start with @samp{iso}.  There are also special
 coding systems @code{no-conversion}, @code{raw-text} and
 @code{emacs-mule} which do not convert printing characters at all.
+@cindex international files from DOS/Windows systems
 A special class of coding systems, collectively known as
 @dfn{codepages}, is designed to support text encoded by MS-Windows and
 MS-DOS software.  To use any of these systems, you need to create it
-with @kbd{M-x codepage-setup}.  @xref{MS-DOS and MULE}.
+with @kbd{M-x codepage-setup}.  @xref{MS-DOS and MULE}.  After
+creating the coding system for the codepage, you can use it as any
+other coding system.  For example, to visit a file encoded in codepage
+850, type @kbd{C-x @key{RET} c cp850 @key{RET} C-x C-f @var{filename}
+@key{RET}}.
 In addition to converting various representations of non-ASCII
 characters, a coding system can perform end-of-line conversion.  Emacs
 handles three different conventions for how to separate lines in a file:
 newline, carriage-return linefeed, and just carriage-return.
 the usual three variants to specify the kind of end-of-line conversion.
 @node Recognize Coding
 @section Recognizing Coding Systems
-Most of the time, Emacs can recognize which coding system to use for
+Emacs tries to recognize which coding system to use for a given text
-any given file---once you have specified your preferences.
+as an integral part of reading that text.  (This applies to files
+being read, output from subprocesses, text from X selections, etc.)
+Emacs can select the right coding system automatically most of the
+time---once you have specified your preferences.
 Some coding systems can be recognized or distinguished by which byte
 sequences appear in the data.  However, there are coding systems that
 cannot be distinguished, not even potentially.  For example, there is no
 way to distinguish between Latin-1 and Latin-2; they use the same byte
 overrides @samp{-*-coding:-*-} tags in the file itself.  Emacs uses this
 feature for tar and archive files, to prevent Emacs from being confused
 by a @samp{-*-coding:-*-} tag in a member of the archive and thinking it
 applies to the archive file as a whole.
+If Emacs recognizes the encoding of a file incorrectly, you can
+reread the file using the correct coding system by typing @kbd{C-x
+@key{RET} c @var{coding-system} @key{RET} M-x revert-buffer
+@key{RET}}.
 @vindex buffer-file-coding-system
 Once Emacs has chosen a coding system for a buffer, it stores that
 coding system in @code{buffer-file-coding-system} and uses that coding
 system, by default, for operations that write from this buffer into a
 file.  This includes the commands @code{save-buffer} and
