comparison etc/PROBLEMS @ 90228:fa0da9b57058
Revision: miles@gnu.org--gnu-2005/emacs--unicode--0--patch-82
Merge from emacs--cvs-trunk--0
Patches applied:
* emacs--cvs-trunk--0 (patch 542-553)
- Update from CVS
- Merge from gnus--rel--5.10
* gnus--rel--5.10 (patch 116-121)
- Merge from emacs--cvs-trunk--0
- Update from CVS
author | Miles Bader <miles@gnu.org> |
---|---|
date | Mon, 19 Sep 2005 10:20:33 +0000 |
parents | 10fe5fadaf89 a3cb8f9ce434 |
children | 7beb78bc1f8e |
comparison
90227:10fe5fadaf89 | 90228:fa0da9b57058 |
---|---|
843 mule-unicode-e000-ffff:-gnu-unifont-*-iso10646-1,\ | 843 mule-unicode-e000-ffff:-gnu-unifont-*-iso10646-1,\ |
844 mule-unicode-0100-24ff:-gnu-unifont-*-iso10646-1 | 844 mule-unicode-0100-24ff:-gnu-unifont-*-iso10646-1 |
845 | 845 |
846 ** The UTF-8/16/7 coding systems don't encode CJK (Far Eastern) characters. | 846 ** The UTF-8/16/7 coding systems don't encode CJK (Far Eastern) characters. |
847 | 847 |
848 Emacs by default only supports the parts of the Unicode BMP whose code | 848 Emacs directly supports the Unicode BMP whose code points are in the |
849 points are in the ranges 0000-33ff and e000-ffff. This excludes: most | 849 ranges 0000-33ff and e000-ffff, and indirectly supports the parts of |
850 of CJK, Yi and Hangul, as well as everything outside the BMP. | 850 CJK characters belonging to these legacy charsets: |
851 | |
852 GB2312, Big5, JISX0208, JISX0212, JISX0213-1, JISX0213-2, KSC5601 | |
853 | |
854 The latter support is done in Utf-Translate-Cjk mode (turned on by | |
855 default). Which Unicode CJK characters are decoded into which Emacs | |
856 charset is decided by the current language environment. For instance, | |
857 in Chinese-GB, most of them are decoded into chinese-gb2312. | |
851 | 858 |
852 If you read UTF-8 data with code points outside these ranges, the | 859 If you read UTF-8 data with code points outside these ranges, the |
853 characters appear in the buffer as raw bytes of the original UTF-8 | 860 characters appear in the buffer as raw bytes of the original UTF-8 |
854 (composed into a single quasi-character) and they will be written back | 861 (composed into a single quasi-character) and they will be written back |
855 correctly as UTF-8, assuming you don't break the composed sequences. | 862 correctly as UTF-8, assuming you don't break the composed sequences. |
856 If you read such characters from UTF-16 or UTF-7 data, they are | 863 If you read such characters from UTF-16 or UTF-7 data, they are |
857 substituted with the Unicode `replacement character', and you lose | 864 substituted with the Unicode `replacement character', and you lose |
858 information. | 865 information. |
859 | |
860 To edit such UTF data, turn on Utf-Translate-Cjk mode, which makes | |
861 many common CJK characters available for encoding and decoding and can | |
862 be extended by updating the tables it uses. This also allows you to | |
863 save as UTF buffers containing characters decoded by the chinese-, | |
864 japanese- and korean- coding systems, e.g. cut and pasted from | |
865 elsewhere. | |
866 | 866 |
867 ** Mule-UCS loads very slowly. | 867 ** Mule-UCS loads very slowly. |
868 | 868 |
869 Changes to Emacs internals interact badly with Mule-UCS's `un-define' | 869 Changes to Emacs internals interact badly with Mule-UCS's `un-define' |
870 library, which is the usual interface to Mule-UCS. Apply the | 870 library, which is the usual interface to Mule-UCS. Apply the |