view etc/charsets/README @ 106395:f2b36fb84bf7

Enhance `c-parse-state' to run efficiently in "brace desserts". * progmodes/cc-mode.el (c-basic-common-init): Call c-state-cache-init. (c-neutralize-syntax-in-and-mark-CPP): Renamed from c-extend-and-neutralize-syntax-in-CPP. Mark each CPP construct by placing `category' properties value 'c-cpp-delimiter at its boundaries. * progmodes/cc-langs.el (c-before-font-lock-function): c-extend-and-neutralize-syntax-in-CPP has been renamed c-neutralize-syntax-in-and-mark-CPP. * progmodes/cc-fonts.el (c-cpp-matchers): Mark template brackets with `category' properties now, not `syntax-table' ones. * progmodes/cc-engine.el (c-syntactic-end-of-macro): A new enhanced (but slower) version of c-end-of-macro that won't land inside a literal or on another awkward character. (c-state-cache-too-far, c-state-cache-start) (c-state-nonlit-pos-interval, c-state-nonlit-pos-cache) (c-state-nonlit-pos-cache-limit, c-state-point-min) (c-state-point-min-lit-type, c-state-point-min-lit-start) (c-state-min-scan-pos, c-state-brace-pair-desert) (c-state-old-cpp-beg, c-state-old-cpp-end): New constants and buffer local variables. (c-state-literal-at, c-state-lit-beg) (c-state-cache-non-literal-place, c-state-get-min-scan-pos) (c-state-mark-point-min-literal, c-state-cache-top-lparen) (c-state-cache-top-paren, c-state-cache-after-top-paren) (c-get-cache-scan-pos, c-get-fallback-scan-pos) (c-state-balance-parens-backwards, c-parse-state-get-strategy) (c-renarrow-state-cache) (c-append-lower-brace-pair-to-state-cache) (c-state-push-any-brace-pair, c-append-to-state-cache) (c-remove-stale-state-cache) (c-remove-stale-state-cache-backwards, c-state-cache-init) (c-invalidate-state-cache-1, c-parse-state-1) (c-invalidate-state-cache): New defuns/defmacros/defsubsts. (c-parse-state): Enhanced and refactored. (c-debug-parse-state): Amended to deal with all the new variables. * progmodes/cc-defs.el (c-<-as-paren-syntax, c-mark-<-as-paren) (c->-as-paren-syntax, c-mark->-as-paren, c-unmark-<->-as-paren): modify to use category text properties rather than syntax-table ones. (c-suppress-<->-as-parens, c-restore-<->-as-parens): new defsubsts to switch off/on the syntactic paren property of C++ template delimiters using the category property. (c-with-<->-as-parens-suppressed): Macro to invoke code with template delims suppressed. (c-cpp-delimiter, c-set-cpp-delimiters, c-clear-cpp-delimiters): New constant/macros which apply category properties to the start and end of preprocessor constructs. (c-comment-out-cpps, c-uncomment-out-cpps): defsubsts which "comment out" the syntactic value of characters in preprocessor constructs. (c-with-cpps-commented-out) (c-with-all-but-one-cpps-commented-out): Macros to invoke code with characters in all or all but one preprocessor constructs "commented out".
author Alan Mackenzie <acm@muc.de>
date Thu, 03 Dec 2009 16:02:10 +0000
parents fe446daa7a49
children 1d1d5d9bd884
line wrap: on
line source

# README file for charset mapping files in this directory.
# Copyright (C) 2003, 2004, 2005, 2006, 2007, 2008, 2009
#   National Institute of Advanced Industrial Science and Technology (AIST)
#   Registration Number H13PRO009
# Copyright (C) 2003, 2004, 2005, 2006, 2007, 2008, 2009
#   Free Software Foundation, Inc.

# This file is part of GNU Emacs.

# GNU Emacs is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.

# GNU Emacs is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.

# You should have received a copy of the GNU General Public License
# along with GNU Emacs.  If not, see <http://www.gnu.org/licenses/>.

(1) Format of mapping files

Each line contains a code point and the corresponding Unicode
character code separated by a space.  Both code points and Unicode
character codes are in hexadecimal preceded by "0x".  Comments may be
used, starting with "#".  Code ranges may also be used, with
(inclusive) start and end code points separated by "-" followed by the
Unicode of the start of the range

Examples:
0xA0 0x00A0  # no-break space

0x8141-0x8143 0x4E04 # map onto a Unicode range


(2) Source of mapping files

All mapping files are generated automatically from data files freely
available on the Internet (e.g. glibc/localedata/charmaps").  See the
file ../../admin/charsets/mapfiles/README for the detail.