view etc/charsets/README @ 95397:c99f0a16c077
(CODING_UTF_8_BOM): New macro.
(enum coding_category): Delete coding_category_utf_8, add
coding_category_utf_8_auto, coding_category_utf_8_nosig, and
coding_category_utf_8_sig.
(CATEGORY_MASK_UTF_8): Delete it.
(CATEGORY_MASK_UTF_8_AUTO, CATEGORY_MASK_UTF_8_NOSIG)
(CATEGORY_MASK_UTF_8_SIG): New macros.
(CATEGORY_MASK_ANY): Delete CATEGORY_MASK_UTF_8, add
CATEGORY_MASK_UTF_8_AUTO, CATEGORY_MASK_UTF_8_NOSIG, and
CATEGORY_MASK_UTF_8_SIG.
(CATEGORY_MASK_UTF_8): New macro.
(UTF_BOM, UTF_8_BOM_1, UTF_8_BOM_2, UTF_8_BOM_3): New macros.
(detect_coding_utf_8): Check BOM.
(decode_coding_utf_8, encode_coding_utf_8): Handle BOM.
(decode_coding_utf_16): Adjusted for the change of enum
utf_bom_type.
(encode_coding_utf_16): Likewise.
(setup_coding_system): Likewise. Set CODING_UTF_8_BOM (coding).
(detect_coding, detect_coding_system): Handle utf-8-auto.
(Fdefine_coding_system_internal): Handle `bom' property for utf-8.
(syms_of_coding): Fix setting up of Vcoding_category_table.
author | Kenichi Handa <handa@m17n.org> |
---|---|
date | Thu, 29 May 2008 22:58:15 +0000 |
parents | e9e67a780afd |
children | c90853557b90 |
# README file for charset mapping files in this directory.

# Copyright (C) 2003, 2004, 2005, 2006, 2007, 2008
#   National Institute of Advanced Industrial Science and Technology (AIST)
#   Registration Number H13PRO009
# Copyright (C) 2003, 2004, 2005, 2006, 2007, 2008
#   Free Software Foundation, Inc.

# This file is part of GNU Emacs.

# GNU Emacs is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.

# GNU Emacs is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.

# You should have received a copy of the GNU General Public License
# along with GNU Emacs.  If not, see <http://www.gnu.org/licenses/>.


(1) Format of mapping files

Each line contains a code point and the corresponding Unicode
character code, separated by a space.  Both code points and Unicode
character codes are in hexadecimal, preceded by "0x".  Comments may be
used, starting with "#".  Code ranges may also be used: the
(inclusive) start and end code points are separated by "-" and
followed by the Unicode character code of the start of the range.

Examples:
0xA0 0x00A0  # no-break space
0x8141-0x8143 0x4E04  # map onto a Unicode range

(2) Source of mapping files

All mapping files are generated automatically from data files freely
available on the Internet (e.g. glibc/localedata/charmaps).  See the
file ../../admin/charsets/Makefile for details.
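
The line format described in section (1) can be illustrated with a
small parser.  This is only a sketch and not code from Emacs: the
function name parse_mapping_line is hypothetical, and it assumes that
a range maps consecutive code points onto consecutive Unicode
character codes, starting from the one given on the line.

```python
import re

# Hypothetical helper for illustration; not part of Emacs.
# Matches "0xNN 0xNNNN" or "0xNN-0xMM 0xNNNN" once comments are stripped.
_MAPPING = re.compile(
    r"^(0x[0-9A-Fa-f]+)(?:-(0x[0-9A-Fa-f]+))?\s+(0x[0-9A-Fa-f]+)$"
)


def parse_mapping_line(line):
    """Return a list of (code_point, unicode_code) pairs for one line.

    Blank lines and comment-only lines yield an empty list.
    """
    # Everything from "#" to the end of the line is a comment.
    body = line.split("#", 1)[0].strip()
    if not body:
        return []

    match = _MAPPING.match(body)
    if match is None:
        raise ValueError(f"unrecognized mapping line: {line!r}")

    start = int(match.group(1), 16)
    # A single code point is treated as a range of length one.
    end = int(match.group(2), 16) if match.group(2) else start
    unicode_start = int(match.group(3), 16)
    if end < start:
        raise ValueError(f"range end precedes start: {line!r}")

    # Assumption: the range is inclusive and maps onto consecutive
    # Unicode character codes beginning at unicode_start.
    return [(start + i, unicode_start + i) for i in range(end - start + 1)]


if __name__ == "__main__":
    for text in ("0xA0 0x00A0  # no-break space",
                 "0x8141-0x8143 0x4E04  # map onto a Unicode range"):
        for code, uni in parse_mapping_line(text):
            print(f"0x{code:X} -> U+{uni:04X}")
```

Under that assumption, the range example from section (1) expands to
three pairs, one for each of the code points 0x8141, 0x8142 and 0x8143.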