annotate etc/charsets/README @ 95397:c99f0a16c077
(CODING_UTF_8_BOM): New macro.
(enum coding_category): Delete coding_category_utf_8, add
coding_category_utf_8_auto, coding_category_utf_8_nosig, and
coding_category_utf_8_sig.
(CATEGORY_MASK_UTF_8): Delete it.
(CATEGORY_MASK_UTF_8_AUTO, CATEGORY_MASK_UTF_8_NOSIG)
(CATEGORY_MASK_UTF_8_SIG): New macros.
(CATEGORY_MASK_ANY): Delete CATEGORY_MASK_UTF_8, add
CATEGORY_MASK_UTF_8_AUTO, CATEGORY_MASK_UTF_8_NOSIG, and
CATEGORY_MASK_UTF_8_SIG.
(CATEGORY_MASK_UTF_8): New macro.
(UTF_BOM, UTF_8_BOM_1, UTF_8_BOM_2, UTF_8_BOM_3): New macros.
(detect_coding_utf_8): Check BOM.
(decode_coding_utf_8, encode_coding_utf_8): Handle BOM.
(decode_coding_utf_16): Adjusted for the change of enum
utf_bom_type.
(encode_coding_utf_16): Likewise.
(setup_coding_system): Likewise. Set CODING_UTF_8_BOM (coding).
(detect_coding, detect_coding_system): Handle utf-8-auto.
(Fdefine_coding_system_internal): Handle `bom' property for utf-8.
(syms_of_coding): Fix setting up of Vcoding_category_table.
author | Kenichi Handa <handa@m17n.org> |
---|---|
date | Thu, 29 May 2008 22:58:15 +0000 |
parents | e9e67a780afd |
children | c90853557b90 |
rev | line source |
---|---|
88417 | 1 # README file for charset mapping files in this directory. |
91424 | 2 # Copyright (C) 2003, 2004, 2005, 2006, 2007, 2008 |
88417 | 3 # National Institute of Advanced Industrial Science and Technology (AIST) |
4 # Registration Number H13PRO009 | |
91424 | 5 # Copyright (C) 2003, 2004, 2005, 2006, 2007, 2008 |
6 # Free Software Foundation, Inc. | |
88417 | 7 |
8 # This file is part of GNU Emacs. | |
9 | |
95005 | 10 # GNU Emacs is free software: you can redistribute it and/or modify |
88417 | 11 # it under the terms of the GNU General Public License as published by |
95005 | 12 # the Free Software Foundation, either version 3 of the License, or |
13 # (at your option) any later version. | |
88417 | 14 |
15 # GNU Emacs is distributed in the hope that it will be useful, | |
16 # but WITHOUT ANY WARRANTY; without even the implied warranty of | |
17 # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the | |
18 # GNU General Public License for more details. | |
19 | |
20 # You should have received a copy of the GNU General Public License | |
95005 | 21 # along with GNU Emacs. If not, see <http://www.gnu.org/licenses/>. |
88417 | 22 |
23 (1) Format of mapping files | |
24 | |
25 Each line contains a code point and the corresponding Unicode | |
26 character code separated by a space. Both code points and Unicode | |
88536 | 27 character codes are in hexadecimal preceded by "0x". Comments may be |
88689 | 28 used, starting with "#". Code ranges may also be used, with |
29 (inclusive) start and end code points separated by "-", followed by the | |
30 Unicode character code of the start of the range. | |
88417 | 31 |
88536 | 32 Examples: |
33 0xA0 0x00A0 # no-break space | |
34 | |
88689 | 35 0x8141-0x8143 0x4E04 # map onto a Unicode range |
88417 | 36 |
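The format described in (1) can be read with a few lines of code. The
following is a minimal sketch in Python, not part of Emacs: the function
name parse_mapping is hypothetical, and it assumes that a code range maps
onto consecutive Unicode character codes beginning at the given start
value, as the range example above suggests.

```python
import re

# One mapping entry: a code point or an inclusive code range, whitespace,
# then a Unicode character code, all in hexadecimal preceded by "0x".
_ENTRY = re.compile(
    r"^(0x[0-9A-Fa-f]+)(?:-(0x[0-9A-Fa-f]+))?\s+(0x[0-9A-Fa-f]+)$"
)


def parse_mapping(text):
    """Return a dict from code point to Unicode character code.

    Hypothetical helper for illustration only. Assumes a range
    START-END U maps START+i to U+i for every i in the range.
    """
    mapping = {}
    for raw in text.splitlines():
        # Drop comments (everything from "#" on) and surrounding blanks.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        match = _ENTRY.match(line)
        if match is None:
            raise ValueError(f"malformed mapping line: {raw!r}")
        start = int(match.group(1), 16)
        end = int(match.group(2), 16) if match.group(2) else start
        if end < start:
            raise ValueError(f"descending code range: {raw!r}")
        unicode_start = int(match.group(3), 16)
        for offset in range(end - start + 1):
            mapping[start + offset] = unicode_start + offset
    return mapping


sample = """\
0xA0 0x00A0 # no-break space

0x8141-0x8143 0x4E04 # map onto a Unicode range
"""
table = parse_mapping(sample)
```

With the two example lines as input, table holds one entry for 0xA0 and
three entries for the range 0x8141 through 0x8143.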
37 | |
38 (2) Source of mapping files | |
39 | |
89482 | 40 All mapping files are generated automatically from data files freely |
41 available on the Internet (e.g. glibc/localedata/charmaps). See the | |
42 file ../../admin/charsets/Makefile for details.