annotate lisp/international/ucs-normalize.el @ 110410:f2e111723c3a

Merge changes made in Gnus trunk.

Reimplement nnimap, and do tweaks to the rest of the code to support that.
* gnus-int.el (gnus-finish-retrieve-group-infos)
(gnus-retrieve-group-data-early): New functions.
* gnus-range.el (gnus-range-nconcat): New function.
* gnus-start.el (gnus-get-unread-articles): Support early retrieval of data.
(gnus-read-active-for-groups): Support finishing the early retrieval of data.
* gnus-sum.el (gnus-summary-move-article): Pass the move-to group name if the move is internal, so that nnimap can do fast internal moves.
* gnus.el (gnus-article-special-mark-lists): Add uid/active tuples, for nnimap usage.
* nnimap.el: Rewritten.
* nnmail.el (nnmail-inhibit-default-split-group): New internal variable to allow the mail splitting to not return a default group. This is useful for nnimap, which will leave unmatched mail in the inbox.
* utf7.el (utf7-encode): Autoload.

Implement shell connection.
* nnimap.el (nnimap-open-shell-stream): New function.
(nnimap-open-connection): Use it.

Get the number of lines by using BODYSTRUCTURE.
(nnimap-transform-headers): Get the number of lines in each message.
(nnimap-retrieve-headers): Query for BODYSTRUCTURE so that we get the number of lines.

Not all servers return UIDNEXT. Work past this problem.
Remove junk from end of file.
Fix typo in "bogus" section.
Make capabilities be case-insensitive.
Require cl when compiling.
Don't bug out if the LIST command doesn't have any parameters.

2010-09-17 Knut Anders Hatlen <kahatlen@gmail.com> (tiny change)
* nnimap.el (nnimap-get-groups): Don't bug out if the LIST command doesn't have any parameters.

(mm-text-html-renderer): Document gnus-article-html.

2010-09-17 Julien Danjou <julien@danjou.info> (tiny fix)
* mm-decode.el (mm-text-html-renderer): Document gnus-article-html.
* dgnushack.el: Define netrc-credentials.

If the user doesn't have a /etc/services, supply some sensible port defaults.
Have `unseen-or-unread' select an unread unseen article first.
(nntp-open-server): Return whether the open was successful or not.
Throughout all files, replace (save-excursion (set-buffer ...)) with (with-current-buffer ... ).
Save result so that it doesn't say "failed" all the time.
Add ~/.authinfo to the default, since that's probably most useful for users.
Don't use the "finish" method when we're reading from the agent.
Add some more nnimap-relevant agent stuff to nnagent.el.
* nnimap.el (nnimap-with-process-buffer): Removed.
Revert one line that was changed by mistake in the last checkin.
(nnimap-open-connection): Don't error out when we can't make a connection
nnimap-related changes to avoid bugging out if we can't contact a server.
* gnus-start.el (gnus-get-unread-articles): Don't try to scan groups from methods that are denied.
* nnimap.el (nnimap-possibly-change-group): Return nil if we can't log in.
(nnimap-finish-retrieve-group-infos): Make sure we're not waiting for nothing.
* gnus-sum.el (gnus-select-newsgroup): Indent.
author Katsumi Yamaoka <yamaoka@jpl.org>
date Sat, 18 Sep 2010 10:02:19 +0000
parents 4d54e23aa31e
children 417b1e4d63cd
rev   line source
104250
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
1 ;;; ucs-normalize.el --- Unicode normalization NFC/NFD/NFKD/NFKC
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
2
106815
1d1d5d9bd884 Add 2010 to copyright years.
Glenn Morris <rgm@gnu.org>
parents: 105620
diff changeset
3 ;; Copyright (C) 2009, 2010
104250
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
4 ;; Free Software Foundation, Inc.
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
5
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
6 ;; Author: Taichi Kawabata <kawabata.taichi@gmail.com>
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
7 ;; Keywords: unicode, normalization
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
8
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
9 ;; This file is part of GNU Emacs.
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
10
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
11 ;; GNU Emacs is free software: you can redistribute it and/or modify
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
12 ;; it under the terms of the GNU General Public License as published by
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
13 ;; the Free Software Foundation, either version 3 of the License, or
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
14 ;; (at your option) any later version.
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
15
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
16 ;; GNU Emacs is distributed in the hope that it will be useful,
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
17 ;; but WITHOUT ANY WARRANTY; without even the implied warranty of
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
18 ;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
19 ;; GNU General Public License for more details.
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
20
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
21 ;; You should have received a copy of the GNU General Public License
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
22 ;; along with GNU Emacs. If not, see <http://www.gnu.org/licenses/>.
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
23
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
24 ;;; Commentary:
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
25 ;;
105620
4b7680ee254c (ucs-normalize-version): Changed to 1.2.
Kenichi Handa <handa@m17n.org>
parents: 104683
diff changeset
26 ;; This program has passed the NormalizationTest-5.2.0.txt.
104250
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
27 ;;
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
28 ;; References:
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
29 ;; http://www.unicode.org/reports/tr15/
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
30 ;; http://www.unicode.org/review/pr-29.html
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
31 ;;
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
32 ;; HFS-Normalization:
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
33 ;; Reference:
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
34 ;; http://developer.apple.com/technotes/tn/tn1150.html
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
35 ;;
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
36 ;; HFS Normalization excludes the following areas from decomposition.
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
37 ;;
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
38 ;; U+02000 .. U+02FFF :: Punctuation, symbols, dingbats, arrows, etc.
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
39 ;; (Characters in this region will be composed.)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
40 ;; U+0F900 .. U+0FAFF :: CJK compatibility Ideographs.
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
41 ;; U+2F800 .. U+2FFFF :: CJK compatibility Ideographs.
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
42 ;;
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
43 ;; HFS-Normalization is useful for normalizing text involving CJK Ideographs.
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
44 ;;
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
45 ;;;
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
46 ;;; Implementation Notes on NFC/HFS-NFC.
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
47 ;;;
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
48 ;;
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
49 ;; <Stages> Decomposition Composition
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
50 ;; NFD: 'nfd nil
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
51 ;; NFC: 'nfd t
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
52 ;; NFKD: 'nfkd nil
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
53 ;; NFKC: 'nfkd t
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
54 ;; HFS-NFD: 'hfs-nfd 'hfs-nfd-comp-p
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
55 ;; HFS-NFC: 'hfs-nfd t
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
56 ;;
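A minimal usage sketch of the stage table above, assuming the string-level entry points `ucs-normalize-NFD-string', `ucs-normalize-NFC-string' and `ucs-normalize-NFKC-string' that the complete file defines further down (they are not visible in this excerpt). The wrapper `my-ucs-stage-demo' is a hypothetical name used only for illustration.

```elisp
;; Sketch only: relies on the string-level functions of the complete
;; ucs-normalize.el, which are outside this excerpt.
(require 'ucs-normalize)

(defun my-ucs-stage-demo ()
  "Return a list of booleans, one per normalization stage exercised."
  (let ((composed   (string #x00E9))      ; LATIN SMALL LETTER E WITH ACUTE
        (decomposed (string ?e #x0301))   ; e + COMBINING ACUTE ACCENT
        (ligature   (string #xFB01)))     ; LATIN SMALL LIGATURE FI
    (list
     ;; NFD: canonical decomposition, no composition pass.
     (equal (ucs-normalize-NFD-string composed) decomposed)
     ;; NFC: the same decomposition, followed by composition.
     (equal (ucs-normalize-NFC-string decomposed) composed)
     ;; NFKC: compatibility decomposition, then composition.
     (equal (ucs-normalize-NFKC-string ligature) "fi"))))
```

Calling (my-ucs-stage-demo) should yield a list of three non-nil values if the three forms behave as Unicode TR15 describes.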
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
57 ;; Algorithm for Normalization
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
58 ;;
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
59 ;; Before normalization, the following data will be prepared.
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
60 ;;
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
61 ;; 1. quick-check-list
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
62 ;;
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
63 ;; `quick-check-list' consists of characters that will be decomposed
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
64 ;; during normalization. It includes composition-exclusions,
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
65 ;; singletons, non-starter-decompositions and decomposable
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
66 ;; characters.
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
67 ;;
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
68 ;; `quick-check-regexp' will search the above characters plus
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
69 ;; combining characters.
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
70 ;;
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
71 ;; 2. decomposition-translation
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
72 ;;
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
73 ;; `decomposition-translation' is a translation table that will be
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
74 ;; used to decompose the characters.
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
75 ;;
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
76 ;;
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
77 ;; Normalization Process
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
78 ;;
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
79 ;; A. Searching (`ucs-normalize-region')
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
80 ;;
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
81 ;; Region is searched for `quick-check-regexp' to find possibly
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
82 ;; normalizable point.
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
83 ;;
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
84 ;; B. Identification of Normalization Block
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
85 ;;
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
86 ;; (1) start of the block
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
87 ;; If the searched character is a starter and not combining
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
88 ;; with previous character, then the beginning of the block is
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
89 ;; the searched character. If searched character is combining
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
90 ;; character, then previous character will be the target
104540
fc0ed6b4a2b2 (nfd, decomposition-translation-alist, decomposition-char-recursively)
Glenn Morris <rgm@gnu.org>
parents: 104343
diff changeset
91 ;; character.
fc0ed6b4a2b2 (nfd, decomposition-translation-alist, decomposition-char-recursively)
Glenn Morris <rgm@gnu.org>
parents: 104343
diff changeset
92 ;; (2) end of the block
104250
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
93 ;; Block ends at non-composable starter character.
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
94 ;;
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
95 ;; C. Decomposition (`ucs-normalize-block')
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
96 ;;
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
97 ;; The entire block will be decomposed by
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
98 ;; `decomposition-translation' table.
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
99 ;;
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
100 ;; D. Sorting and Composition of Smaller Blocks (`ucs-normalize-block-compose-chars')
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
101 ;;
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
102 ;; The block will be split into multiple smaller blocks by starter
110361
4d54e23aa31e Fix typos in comments and ChangeLogs.
Juanma Barranquero <lekktu@gmail.com>
parents: 106815
diff changeset
103 ;; characters. Each block is sorted, and composed if necessary.
104250
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
104 ;;
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
105 ;; E. Composition of Entire Block (`ucs-normalize-compose-chars')
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
106 ;;
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
107 ;; Composed blocks are collected and again composed.
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
108
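The sorting in step D is the canonical ordering of combining characters by combining class. The following is a rough sketch of that one step in isolation, not the code this file uses. The names `my-ccc' and `my-canonical-order' are hypothetical, and the sketch assumes it is handed only the combining characters that follow one starter.

```elisp
(defun my-ccc (char)
  "Return the canonical combining class of CHAR, or 0 if none is recorded."
  (or (get-char-code-property char 'canonical-combining-class) 0))

(defun my-canonical-order (chars)
  "Return a copy of the list CHARS sorted by combining class.
`sort' is stable, so characters of equal class keep their
relative order, as canonical ordering requires."
  (sort (copy-sequence chars)
        (lambda (a b) (< (my-ccc a) (my-ccc b)))))

;; U+0301 COMBINING ACUTE ACCENT has class 230 and U+0323 COMBINING
;; DOT BELOW has class 220, so the dot below sorts before the acute.
(my-canonical-order '(#x0301 #x0323))   ; => (803 769)
```

Two marks of the same class, such as U+0301 and U+0300 (both 230), are left in their original order because swapping them could change the rendered text.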
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
109 ;;; Code:
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
110
105620
4b7680ee254c (ucs-normalize-version): Changed to 1.2.
Kenichi Handa <handa@m17n.org>
parents: 104683
diff changeset
111 (defconst ucs-normalize-version "1.2")
104250
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
112
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
113 (eval-when-compile (require 'cl))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
114
104540
fc0ed6b4a2b2 (nfd, decomposition-translation-alist, decomposition-char-recursively)
Glenn Morris <rgm@gnu.org>
parents: 104343
diff changeset
115 (declare-function nfd "ucs-normalize" (char))
fc0ed6b4a2b2 (nfd, decomposition-translation-alist, decomposition-char-recursively)
Glenn Morris <rgm@gnu.org>
parents: 104343
diff changeset
116
104250
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
117 (eval-when-compile
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
118
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
119 (defconst ucs-normalize-composition-exclusions
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
120 '(#x0958 #x0959 #x095A #x095B #x095C #x095D #x095E #x095F
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
121 #x09DC #x09DD #x09DF #x0A33 #x0A36 #x0A59 #x0A5A #x0A5B
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
122 #x0A5E #x0B5C #x0B5D #x0F43 #x0F4D #x0F52 #x0F57 #x0F5C
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
123 #x0F69 #x0F76 #x0F78 #x0F93 #x0F9D #x0FA2 #x0FA7 #x0FAC
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
124 #x0FB9 #xFB1D #xFB1F #xFB2A #xFB2B #xFB2C #xFB2D #xFB2E
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
125 #xFB2F #xFB30 #xFB31 #xFB32 #xFB33 #xFB34 #xFB35 #xFB36
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
126 #xFB38 #xFB39 #xFB3A #xFB3B #xFB3C #xFB3E #xFB40 #xFB41
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
127 #xFB43 #xFB44 #xFB46 #xFB47 #xFB48 #xFB49 #xFB4A #xFB4B
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
128 #xFB4C #xFB4D #xFB4E #x2ADC #x1D15E #x1D15F #x1D160 #x1D161
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
129 #x1D162 #x1D163 #x1D164 #x1D1BB #x1D1BC #x1D1BD #x1D1BE
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
130 #x1D1BF #x1D1C0)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
131 "Composition Exclusion List.
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
132 This list is taken from
105620
4b7680ee254c (ucs-normalize-version): Changed to 1.2.
Kenichi Handa <handa@m17n.org>
parents: 104683
diff changeset
133 http://www.unicode.org/Public/UNIDATA/5.2/CompositionExclusions.txt")
104250
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
134
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
135 ;; Unicode ranges in which decompositions and combining characters are defined.
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
136 (defvar check-range nil)
105620
4b7680ee254c (ucs-normalize-version): Changed to 1.2.
Kenichi Handa <handa@m17n.org>
parents: 104683
diff changeset
137 (setq check-range '((#x00a0 . #x3400) (#xA600 . #xAC00) (#xF900 . #x110ff) (#x1d000 . #x1dfff) (#x1f100 . #x1f2ff) (#x2f800 . #x2faff)))
104250
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
138
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
139 ;; Basic normalization functions
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
140 (defun nfd (char)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
141 (let ((decomposition
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
142 (get-char-code-property char 'decomposition)))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
143 (if (and decomposition (numberp (car decomposition)))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
144 decomposition)))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
145
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
146 (defun nfkd (char)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
147 (let ((decomposition
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
148 (get-char-code-property char 'decomposition)))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
149 (if (symbolp (car decomposition)) (cdr decomposition)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
150 decomposition)))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
151
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
152 (defun hfs-nfd (char)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
153 (when (or (and (>= char 0) (< char #x2000))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
154 (and (>= char #x3000) (< char #xf900))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
155 (and (>= char #xfb00) (< char #x2f800))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
156 (>= char #x30000))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
157 (nfd char))))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
158
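`nfd' and `nfkd' above differ only in how they treat the head of the `decomposition' property: a number means a canonical mapping, a symbol means a tagged compatibility mapping. The sketch below makes that distinction explicit. The helper name `my-decomposition-kind' is hypothetical, and treating a one-element list that equals the character itself as "no decomposition" is an assumption about how the property is reported for undecomposable characters.

```elisp
(defun my-decomposition-kind (char)
  "Classify the `decomposition' property of CHAR.
Return `compatibility', `canonical', or nil when CHAR has no
decomposition."
  (let ((d (get-char-code-property char 'decomposition)))
    (cond
     ((null d) nil)
     ;; A leading symbol is a compatibility formatting tag.
     ((symbolp (car d)) 'compatibility)
     ;; More than one element, or a different single character.
     ((or (cdr d) (/= (car d) char)) 'canonical)
     ;; The property merely echoes CHAR itself.
     (t nil))))

(my-decomposition-kind #x00E9)   ; e with acute  => canonical
(my-decomposition-kind #xFB01)   ; fi ligature   => compatibility
(my-decomposition-kind ?a)       ; plain letter  => nil
```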
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
159 (eval-and-compile
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
160 (defun ucs-normalize-hfs-nfd-comp-p (char)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
161 (and (>= char #x2000) (< char #x3000)))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
162
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
163 (defsubst ucs-normalize-ccc (char)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
164 (get-char-code-property char 'canonical-combining-class))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
165 )
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
166
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
167 ;; Data common to all normalizations
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
168
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
169 (eval-when-compile
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
170
104540
fc0ed6b4a2b2 (nfd, decomposition-translation-alist, decomposition-char-recursively)
Glenn Morris <rgm@gnu.org>
parents: 104343
diff changeset
171 (defvar combining-chars nil)
104250
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
172 (setq combining-chars nil)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
173 (defvar decomposition-pair-to-composition nil)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
174 (setq decomposition-pair-to-composition nil)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
175 (defvar non-starter-decompositions nil)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
176 (setq non-starter-decompositions nil)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
177 (let ((char 0) ccc decomposition)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
178 (mapc
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
179 (lambda (start-end)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
180 (do ((char (car start-end) (+ char 1))) ((> char (cdr start-end)))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
181 (setq ccc (ucs-normalize-ccc char))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
182 (setq decomposition (get-char-code-property
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
183 char 'decomposition))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
184 (if (and ccc (/= 0 ccc)) (add-to-list 'combining-chars char))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
185 (if (and (numberp (car decomposition))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
186 (/= (ucs-normalize-ccc (car decomposition))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
187 0))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
188 (add-to-list 'non-starter-decompositions char))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
189 (when (numberp (car decomposition))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
190 (if (and (= 2 (length decomposition))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
191 (null (memq char ucs-normalize-composition-exclusions))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
192 (null (memq char non-starter-decompositions)))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
193 (setq decomposition-pair-to-composition
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
194 (cons (cons decomposition char)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
195 decomposition-pair-to-composition)))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
196 ;; If not singleton decomposition, second and later characters in
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
197 ;; decomposition will be the subject of combining characters.
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
198 (if (cdr decomposition)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
199 (dolist (char (cdr decomposition))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
200 (add-to-list 'combining-chars char))))))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
201 check-range))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
202
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
203 (setq combining-chars
104540
fc0ed6b4a2b2 (nfd, decomposition-translation-alist, decomposition-char-recursively)
Glenn Morris <rgm@gnu.org>
parents: 104343
diff changeset
204 (append combining-chars
104250
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
205 '(?ᅡ ?ᅢ ?ᅣ ?ᅤ ?ᅥ ?ᅦ ?ᅧ ?ᅨ ?ᅩ ?ᅪ
104540
fc0ed6b4a2b2 (nfd, decomposition-translation-alist, decomposition-char-recursively)
Glenn Morris <rgm@gnu.org>
parents: 104343
diff changeset
206 ?ᅫ ?ᅬ ?ᅭ ?ᅮ ?ᅯ ?ᅰ ?ᅱ ?ᅲ ?ᅳ ?ᅴ ?ᅵ
104250
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
207 ?ᆨ ?ᆩ ?ᆪ ?ᆫ ?ᆬ ?ᆭ ?ᆮ ?ᆯ ?ᆰ ?ᆱ ?ᆲ ?ᆳ ?ᆴ
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
208 ?ᆵ ?ᆶ ?ᆷ ?ᆸ ?ᆹ ?ᆺ ?ᆻ ?ᆼ ?ᆽ ?ᆾ ?ᆿ ?ᇀ ?ᇁ ?ᇂ)))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
209 )
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
210
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
211 (eval-and-compile
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
212 (defun ucs-normalize-make-hash-table-from-alist (alist)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
213 (let ((table (make-hash-table :test 'equal :size 2000)))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
214 (mapc (lambda (x) (puthash (car x) (cdr x) table)) alist)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
215 table))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
216
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
217 (defvar ucs-normalize-decomposition-pair-to-primary-composite nil
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
218 "Hashtable of decomposed pair to primary composite.
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
219 Note that Hangul are excluded.")
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
220 (setq ucs-normalize-decomposition-pair-to-primary-composite
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
221 (ucs-normalize-make-hash-table-from-alist
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
222 (eval-when-compile decomposition-pair-to-composition)))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
223
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
224 (defun ucs-normalize-primary-composite (decomposition-pair composition-predicate)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
225 "Convert DECOMPOSITION-PAIR to primary composite using COMPOSITION-PREDICATE."
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
226 (let ((char (or (gethash decomposition-pair
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
227 ucs-normalize-decomposition-pair-to-primary-composite)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
228 (and (<= #x1100 (car decomposition-pair))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
229 (< (car decomposition-pair) #x1113)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
230 (<= #x1161 (cadr decomposition-pair))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
231 (< (cadr decomposition-pair) #x1176)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
232 (let ((lindex (- (car decomposition-pair) #x1100))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
233 (vindex (- (cadr decomposition-pair) #x1161)))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
234 (+ #xAC00 (* (+ (* lindex 21) vindex) 28))))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
235 (and (<= #xac00 (car decomposition-pair))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
236 (< (car decomposition-pair) #xd7a4)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
237 (<= #x11a7 (cadr decomposition-pair))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
238 (< (cadr decomposition-pair) #x11c3)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
239 (= 0 (% (- (car decomposition-pair) #xac00) 28))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
240 (let ((tindex (- (cadr decomposition-pair) #x11a7)))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
241 (+ (car decomposition-pair) tindex))))))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
242 (if (and char
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
243 (functionp composition-predicate)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
244 (null (funcall composition-predicate char)))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
245 nil char)))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
246 )
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
247
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
248 (defvar ucs-normalize-combining-chars nil)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
249 (setq ucs-normalize-combining-chars (eval-when-compile combining-chars))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
250
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
251 (defvar ucs-normalize-combining-chars-regexp nil
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
252 "Regular expression to match sequence of combining characters.")
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
253 (setq ucs-normalize-combining-chars-regexp
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
254 (eval-when-compile (concat (regexp-opt (mapcar 'char-to-string combining-chars)) "+")))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
255
104540
fc0ed6b4a2b2 (nfd, decomposition-translation-alist, decomposition-char-recursively)
Glenn Morris <rgm@gnu.org>
parents: 104343
diff changeset
256 (declare-function decomposition-translation-alist "ucs-normalize"
fc0ed6b4a2b2 (nfd, decomposition-translation-alist, decomposition-char-recursively)
Glenn Morris <rgm@gnu.org>
parents: 104343
diff changeset
257 (decomposition-function))
fc0ed6b4a2b2 (nfd, decomposition-translation-alist, decomposition-char-recursively)
Glenn Morris <rgm@gnu.org>
parents: 104343
diff changeset
258 (declare-function decomposition-char-recursively "ucs-normalize"
fc0ed6b4a2b2 (nfd, decomposition-translation-alist, decomposition-char-recursively)
Glenn Morris <rgm@gnu.org>
parents: 104343
diff changeset
259 (char decomposition-function))
fc0ed6b4a2b2 (nfd, decomposition-translation-alist, decomposition-char-recursively)
Glenn Morris <rgm@gnu.org>
parents: 104343
diff changeset
260 (declare-function alist-list-to-vector "ucs-normalize" (alist))
fc0ed6b4a2b2 (nfd, decomposition-translation-alist, decomposition-char-recursively)
Glenn Morris <rgm@gnu.org>
parents: 104343
diff changeset
261
104250
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
262 (eval-when-compile
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
263
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
264 (defun decomposition-translation-alist (decomposition-function)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
265 (let (decomposition alist)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
266 (mapc
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
267 (lambda (start-end)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
268 (do ((char (car start-end) (+ char 1))) ((> char (cdr start-end)))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
269 (setq decomposition (funcall decomposition-function char))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
270 (if decomposition
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
271 (setq alist (cons (cons char
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
272 (apply 'append
104540
fc0ed6b4a2b2 (nfd, decomposition-translation-alist, decomposition-char-recursively)
Glenn Morris <rgm@gnu.org>
parents: 104343
diff changeset
273 (mapcar (lambda (x)
104250
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
274 (decomposition-char-recursively
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
275 x decomposition-function))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
276 decomposition)))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
277 alist)))))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
278 check-range)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
279 alist))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
280
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
281 (defun decomposition-char-recursively (char decomposition-function)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
282 (let ((decomposition (funcall decomposition-function char)))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
283 (if decomposition
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
284 (apply 'append
104540
fc0ed6b4a2b2 (nfd, decomposition-translation-alist, decomposition-char-recursively)
Glenn Morris <rgm@gnu.org>
parents: 104343
diff changeset
285 (mapcar (lambda (x)
104250
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
286 (decomposition-char-recursively x decomposition-function))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
287 decomposition))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
288 (list char))))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
289
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
290 (defun alist-list-to-vector (alist)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
291 (mapcar (lambda (x) (cons (car x) (apply 'vector (cdr x)))) alist))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
292
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
293 (defvar nfd-alist nil)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
294 (setq nfd-alist (alist-list-to-vector (decomposition-translation-alist 'nfd)))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
295 (defvar nfkd-alist nil)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
296 (setq nfkd-alist (alist-list-to-vector (decomposition-translation-alist 'nfkd)))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
297 (defvar hfs-nfd-alist nil)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
298 (setq hfs-nfd-alist (alist-list-to-vector (decomposition-translation-alist 'hfs-nfd)))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
299 )
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
300
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
301 (eval-and-compile
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
302 (defvar ucs-normalize-hangul-translation-alist nil)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
303 (setq ucs-normalize-hangul-translation-alist
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
304 (let ((i 0) entries)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
305 (while (< i 11172)
104540
fc0ed6b4a2b2 (nfd, decomposition-translation-alist, decomposition-char-recursively)
Glenn Morris <rgm@gnu.org>
parents: 104343
diff changeset
306 (setq entries
fc0ed6b4a2b2 (nfd, decomposition-translation-alist, decomposition-char-recursively)
Glenn Morris <rgm@gnu.org>
parents: 104343
diff changeset
307 (cons (cons (+ #xac00 i)
104250
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
308 (if (= 0 (% i 28))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
309 (vector (+ #x1100 (/ i 588))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
310 (+ #x1161 (/ (% i 588) 28)))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
311 (vector (+ #x1100 (/ i 588))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
312 (+ #x1161 (/ (% i 588) 28))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
313 (+ #x11a7 (% i 28)))))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
314 entries)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
315 i (1+ i))) entries))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
316
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
317 (defun ucs-normalize-make-translation-table-from-alist (alist)
104540
fc0ed6b4a2b2 (nfd, decomposition-translation-alist, decomposition-char-recursively)
Glenn Morris <rgm@gnu.org>
parents: 104343
diff changeset
318 (make-translation-table-from-alist
104250
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
319 (append alist ucs-normalize-hangul-translation-alist)))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
320
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
321 (define-translation-table 'ucs-normalize-nfd-table
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
322 (ucs-normalize-make-translation-table-from-alist (eval-when-compile nfd-alist)))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
323 (define-translation-table 'ucs-normalize-nfkd-table
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
324 (ucs-normalize-make-translation-table-from-alist (eval-when-compile nfkd-alist)))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
325 (define-translation-table 'ucs-normalize-hfs-nfd-table
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
326 (ucs-normalize-make-translation-table-from-alist (eval-when-compile hfs-nfd-alist)))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
327
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
328 (defun ucs-normalize-sort (chars)
104683
2b8eeeaa8c1d * international/ucs-normalize.el (ucs-normalize-sort, quick-check-list):
Juanma Barranquero <lekktu@gmail.com>
parents: 104540
diff changeset
329 "Sort by canonical combining class of CHARS."
104250
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
330 (sort chars
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
331 (lambda (ch1 ch2)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
332 (< (ucs-normalize-ccc ch1) (ucs-normalize-ccc ch2)))))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
333
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
334 (defun ucs-normalize-compose-chars (chars composition-predicate)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
335 "Compose CHARS by COMPOSITION-PREDICATE.
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
336 CHARS must be sorted and normalized in starter-combining pairs."
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
337 (if composition-predicate
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
338 (let* ((starter (car chars))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
339 remain result prev-ccc
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
340 (target-chars (cdr chars))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
341 target target-ccc
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
342 primary-composite)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
343 (while target-chars
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
344 (setq target (car target-chars)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
345 target-ccc (ucs-normalize-ccc target))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
346 (if (and (or (null prev-ccc)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
347 (< prev-ccc target-ccc))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
348 (setq primary-composite
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
349 (ucs-normalize-primary-composite (list starter target)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
350 composition-predicate)))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
351 ;; case 1: composable
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
352 (setq starter primary-composite
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
353 prev-ccc nil)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
354 (if (= 0 target-ccc)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
355 ;; case 2: move starter
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
356 (setq result (nconc result (cons starter (nreverse remain)))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
357 starter target
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
358 remain nil)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
359 ;; case 3: move target
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
360 (setq prev-ccc target-ccc
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
361 remain (cons target remain))))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
362 (setq target-chars (cdr target-chars)))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
363 (nconc result (cons starter (nreverse remain))))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
364 chars))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
365
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
366 (defun ucs-normalize-block-compose-chars (chars composition-predicate)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
367 "Try composing CHARS by COMPOSITION-PREDICATE.
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
368 If COMPOSITION-PREDICATE is not given, then do nothing."
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
369 (let ((chars (ucs-normalize-sort chars)))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
370 (if composition-predicate
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
371 (ucs-normalize-compose-chars chars composition-predicate)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
372 chars)))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
373 )
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
374
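The Hangul handling above (the composition branch at lines 228-241 and `ucs-normalize-hangul-translation-alist` at lines 302-315) relies on fixed index arithmetic over the syllable block starting at U+AC00. The following is a minimal standalone sketch of that arithmetic, not code from this file; the names `my-hangul-decompose` and `my-hangul-compose` are hypothetical and chosen only for illustration.

```elisp
;; Hypothetical helpers illustrating the Hangul index arithmetic.
;; They are NOT part of ucs-normalize.el.
;; Constants: 21 vowels, 28 trailing slots (including "none"),
;; so 21 * 28 = 588 syllables per leading consonant.

(defun my-hangul-decompose (syllable)
  "Return the list of jamo for SYLLABLE, a character in U+AC00..U+D7A3."
  (let* ((i (- syllable #xac00))
         (lead (+ #x1100 (/ i 588)))
         (vowel (+ #x1161 (/ (% i 588) 28)))
         (tindex (% i 28)))
    ;; A trailing index of 0 means the syllable has no final consonant.
    (if (= 0 tindex)
        (list lead vowel)
      (list lead vowel (+ #x11a7 tindex)))))

(defun my-hangul-compose (jamo)
  "Return the syllable for JAMO, a list (LEAD VOWEL) or (LEAD VOWEL TRAIL)."
  (let* ((lindex (- (nth 0 jamo) #x1100))
         (vindex (- (nth 1 jamo) #x1161))
         (lv (+ #xac00 (* (+ (* lindex 21) vindex) 28)))
         (trail (nth 2 jamo)))
    (if trail
        (+ lv (- trail #x11a7))
      lv)))
```

For example, `(my-hangul-decompose #xd55c)` yields `(#x1112 #x1161 #x11ab)`, and passing that list to `my-hangul-compose` gives back `#xd55c`. Unlike the code in this file, the sketch does no range checking on its arguments.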
104540
fc0ed6b4a2b2 (nfd, decomposition-translation-alist, decomposition-char-recursively)
Glenn Morris <rgm@gnu.org>
parents: 104343
diff changeset
375 (declare-function quick-check-list "ucs-normalize"
fc0ed6b4a2b2 (nfd, decomposition-translation-alist, decomposition-char-recursively)
Glenn Morris <rgm@gnu.org>
parents: 104343
diff changeset
376 (decomposition-translation &optional composition-predicate))
fc0ed6b4a2b2 (nfd, decomposition-translation-alist, decomposition-char-recursively)
Glenn Morris <rgm@gnu.org>
parents: 104343
diff changeset
377 (declare-function quick-check-list-to-regexp "ucs-normalize" (quick-check-list))
fc0ed6b4a2b2 (nfd, decomposition-translation-alist, decomposition-char-recursively)
Glenn Morris <rgm@gnu.org>
parents: 104343
diff changeset
378
104250
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
379 (eval-when-compile
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
380
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
381 (defun quick-check-list (decomposition-translation
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
382 &optional composition-predicate)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
383 "Quick-Check List for DECOMPOSITION-TRANSLATION and COMPOSITION-PREDICATE.
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
384 It includes Singletons, CompositionExclusions, and Non-Starter
104683
2b8eeeaa8c1d * international/ucs-normalize.el (ucs-normalize-sort, quick-check-list):
Juanma Barranquero <lekktu@gmail.com>
parents: 104540
diff changeset
385 decomposition."
104250
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
386 (let (entries decomposition composition)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
387 (mapc
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
388 (lambda (start-end)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
389 (do ((i (car start-end) (+ i 1))) ((> i (cdr start-end)))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
390 (setq decomposition
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
391 (string-to-list
104540
fc0ed6b4a2b2 (nfd, decomposition-translation-alist, decomposition-char-recursively)
Glenn Morris <rgm@gnu.org>
parents: 104343
diff changeset
392 (with-temp-buffer
104250
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
393 (insert i)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
394 (translate-region 1 2 decomposition-translation)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
395 (buffer-string))))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
396 (setq composition
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
397 (ucs-normalize-block-compose-chars decomposition composition-predicate))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
398 (when (not (equal composition (list i)))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
399 (setq entries (cons i entries)))))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
400 check-range)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
401 ;;(remove-duplicates
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
402 (append entries
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
403 ucs-normalize-composition-exclusions
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
404 non-starter-decompositions)))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
405 ;;)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
406
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
407 (defvar nfd-quick-check-list nil)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
408 (setq nfd-quick-check-list (quick-check-list 'ucs-normalize-nfd-table ))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
409 (defvar nfc-quick-check-list nil)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
410 (setq nfc-quick-check-list (quick-check-list 'ucs-normalize-nfd-table t ))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
411 (defvar nfkd-quick-check-list nil)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
412 (setq nfkd-quick-check-list (quick-check-list 'ucs-normalize-nfkd-table ))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
413 (defvar nfkc-quick-check-list nil)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
414 (setq nfkc-quick-check-list (quick-check-list 'ucs-normalize-nfkd-table t ))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
415 (defvar hfs-nfd-quick-check-list nil)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
416 (setq hfs-nfd-quick-check-list (quick-check-list 'ucs-normalize-hfs-nfd-table
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
417 'ucs-normalize-hfs-nfd-comp-p))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
418 (defvar hfs-nfc-quick-check-list nil)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
419 (setq hfs-nfc-quick-check-list (quick-check-list 'ucs-normalize-hfs-nfd-table t ))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
420
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
421 (defun quick-check-list-to-regexp (quick-check-list)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
422 (regexp-opt (mapcar 'char-to-string (append quick-check-list combining-chars))))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
423
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
424 (defun quick-check-decomposition-list-to-regexp (quick-check-list)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
425 (concat (quick-check-list-to-regexp quick-check-list) "\\|[가-힣]"))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
426
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
427 (defun quick-check-composition-list-to-regexp (quick-check-list)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
428 (concat (quick-check-list-to-regexp quick-check-list) "\\|[ᅡ-ᅵᆨ-ᇂ]"))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
429 )
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
430
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
431
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
432 ;; NFD/NFC
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
433 (defvar ucs-normalize-nfd-quick-check-regexp nil)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
434 (setq ucs-normalize-nfd-quick-check-regexp
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
435 (eval-when-compile (quick-check-decomposition-list-to-regexp nfd-quick-check-list)))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
436 (defvar ucs-normalize-nfc-quick-check-regexp nil)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
437 (setq ucs-normalize-nfc-quick-check-regexp
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
438 (eval-when-compile (quick-check-composition-list-to-regexp nfc-quick-check-list)))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
439
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
440 ;; NFKD/NFKC
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
441 (defvar ucs-normalize-nfkd-quick-check-regexp nil)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
442 (setq ucs-normalize-nfkd-quick-check-regexp
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
443 (eval-when-compile (quick-check-decomposition-list-to-regexp nfkd-quick-check-list)))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
444 (defvar ucs-normalize-nfkc-quick-check-regexp nil)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
445 (setq ucs-normalize-nfkc-quick-check-regexp
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
446 (eval-when-compile (quick-check-composition-list-to-regexp nfkc-quick-check-list)))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
447
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
448 ;; HFS-NFD/HFS-NFC
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
449 (defvar ucs-normalize-hfs-nfd-quick-check-regexp nil)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
450 (setq ucs-normalize-hfs-nfd-quick-check-regexp
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
451 (eval-when-compile (concat (quick-check-decomposition-list-to-regexp hfs-nfd-quick-check-list))))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
452 (defvar ucs-normalize-hfs-nfc-quick-check-regexp nil)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
453 (setq ucs-normalize-hfs-nfc-quick-check-regexp
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
454 (eval-when-compile (quick-check-composition-list-to-regexp hfs-nfc-quick-check-list)))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
455
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
456 ;;------------------------------------------------------------------------------------------
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
457
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
458 ;; Normalize local region.
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
459
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
460 (defun ucs-normalize-block
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
461 (from to &optional decomposition-translation-table composition-predicate)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
462 "Normalize region from FROM to TO by sorting it by canonical combining class.
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
463 If DECOMPOSITION-TRANSLATION-TABLE is given, translate region
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
464 before sorting. If COMPOSITION-PREDICATE is given, then compose
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
465 the region by using it."
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
466 (save-restriction
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
467 (narrow-to-region from to)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
468 (goto-char (point-min))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
469 (if decomposition-translation-table
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
470 (translate-region from to decomposition-translation-table))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
471 (goto-char (point-min))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
472 (let ((start (point)) chars); ccc)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
473 (while (not (eobp))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
474 (forward-char)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
475 (when (or (eobp)
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
476 (= 0 (ucs-normalize-ccc (char-after (point)))))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
477 (setq chars
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
478 (nconc chars
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
479 (ucs-normalize-block-compose-chars
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
480 (string-to-list (buffer-substring start (point)))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
481 composition-predicate))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
482 start (point)))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
483 ;;(unless ccc (error "Undefined character can not be normalized!"))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
484 )
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
485 (delete-region (point-min) (point-max))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
486 (apply 'insert
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
487 (ucs-normalize-compose-chars
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
488 chars composition-predicate)))))
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
parents:
diff changeset
489
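The sorting step that `ucs-normalize-block` performs on each starter-plus-marks block can be tried in isolation. The sketch below is an assumption-laden illustration rather than the file's own code: `my-ccc`, `my-sort-run`, and `my-canonical-reorder` are hypothetical names, and instead of walking a buffer it works on a plain list of characters. It relies on `get-char-code-property` with the `canonical-combining-class` property and on `sort` being stable for lists.

```elisp
;; Hypothetical helpers illustrating canonical reordering.
;; They are NOT part of ucs-normalize.el.

(defun my-ccc (ch)
  "Return the canonical combining class of CH, treating nil as 0."
  (or (get-char-code-property ch 'canonical-combining-class) 0))

(defun my-sort-run (run)
  "Return a copy of RUN sorted by combining class (stable)."
  (sort (copy-sequence run)
        (lambda (a b) (< (my-ccc a) (my-ccc b)))))

(defun my-canonical-reorder (chars)
  "Return CHARS with each run of combining marks sorted by class.
Starters (combining class 0) keep their positions."
  (let (result run)
    (dolist (ch chars)
      (if (= 0 (my-ccc ch))
          ;; A starter closes the pending run of marks: flush it sorted.
          (setq result (append result (my-sort-run run) (list ch))
                run nil)
        (setq run (append run (list ch)))))
    (append result (my-sort-run run))))
```

With this sketch, `(my-canonical-reorder (list ?a #x0301 #x0323))` moves U+0323 (class 220) ahead of U+0301 (class 230). The file's own `ucs-normalize-sort` sorts the whole block including its starter, which gives the same order because a leading starter has class 0 and the sort is stable.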
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
(defun ucs-normalize-region
  (from to quick-check-regexp translation-table composition-predicate)
  "Normalize region from FROM to TO.
QUICK-CHECK-REGEXP is applied for searching the region.
TRANSLATION-TABLE will be used to decompose region.
COMPOSITION-PREDICATE will be used to compose region."
  (save-excursion
    (save-restriction
      (narrow-to-region from to)
      (goto-char (point-min))
      (let (start-pos starter)
        (while (re-search-forward quick-check-regexp nil t)
          (setq starter (string-to-char (match-string 0)))
          (setq start-pos (match-beginning 0))
          (ucs-normalize-block
           ;; from
           (if (or (= start-pos (point-min))
                   (and (= 0 (ucs-normalize-ccc starter))
                        (not (memq starter ucs-normalize-combining-chars))))
               start-pos (1- start-pos))
           ;; to
           (if (looking-at ucs-normalize-combining-chars-regexp)
               (match-end 0) (1+ start-pos))
           translation-table composition-predicate))))))

;; --------------------------------------------------------------------------------

;; The expansion refers to the caller's variable `str', so this macro
;; is only usable inside a function whose string argument has that name.
(defmacro ucs-normalize-string (ucs-normalize-region)
  `(with-temp-buffer
     (insert str)
     (,ucs-normalize-region (point-min) (point-max))
     (buffer-string)))

;;;###autoload
(defun ucs-normalize-NFD-region (from to)
  "Normalize the current region by the Unicode NFD."
  (interactive "r")
  (ucs-normalize-region from to
                        ucs-normalize-nfd-quick-check-regexp
                        'ucs-normalize-nfd-table nil))
;;;###autoload
(defun ucs-normalize-NFD-string (str)
  "Normalize the string STR by the Unicode NFD."
  (ucs-normalize-string ucs-normalize-NFD-region))

;;;###autoload
(defun ucs-normalize-NFC-region (from to)
  "Normalize the current region by the Unicode NFC."
  (interactive "r")
  (ucs-normalize-region from to
                        ucs-normalize-nfc-quick-check-regexp
                        'ucs-normalize-nfd-table t))
;;;###autoload
(defun ucs-normalize-NFC-string (str)
  "Normalize the string STR by the Unicode NFC."
  (ucs-normalize-string ucs-normalize-NFC-region))
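
The NFD and NFC entry points defined above can be exercised with a small sketch like the following. The `my-` variable names and the sample character are assumptions made for illustration; they are not part of ucs-normalize.el.

```elisp
(require 'ucs-normalize)

;; Hypothetical sample data: "é" as a single code point (U+00E9)
;; and as "e" followed by COMBINING ACUTE ACCENT (U+0301).
(defvar my-composed (string #xE9))
(defvar my-decomposed (string ?e #x301))

;; NFD splits the precomposed character; NFC recombines the pair.
(defvar my-nfd (ucs-normalize-NFD-string my-composed))
(defvar my-nfc (ucs-normalize-NFC-string my-decomposed))

;; The *-region variants perform the same normalization in place
;; on buffer text between two positions.
(defvar my-region-result
  (with-temp-buffer
    (insert my-composed)
    (ucs-normalize-NFD-region (point-min) (point-max))
    (buffer-string)))
```

Under these assumptions, `my-nfd` and `my-region-result` should both hold the two-character decomposed form, and `my-nfc` the one-character composed form.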

;;;###autoload
(defun ucs-normalize-NFKD-region (from to)
  "Normalize the current region by the Unicode NFKD."
  (interactive "r")
  (ucs-normalize-region from to
                        ucs-normalize-nfkd-quick-check-regexp
                        'ucs-normalize-nfkd-table nil))
;;;###autoload
(defun ucs-normalize-NFKD-string (str)
  "Normalize the string STR by the Unicode NFKD."
  (ucs-normalize-string ucs-normalize-NFKD-region))

;;;###autoload
(defun ucs-normalize-NFKC-region (from to)
  "Normalize the current region by the Unicode NFKC."
  (interactive "r")
  (ucs-normalize-region from to
                        ucs-normalize-nfkc-quick-check-regexp
                        'ucs-normalize-nfkd-table t))
;;;###autoload
(defun ucs-normalize-NFKC-string (str)
  "Normalize the string STR by the Unicode NFKC."
  (ucs-normalize-string ucs-normalize-NFKC-region))

;;;###autoload
(defun ucs-normalize-HFS-NFD-region (from to)
  "Normalize the current region by the Unicode NFD and Mac OS's HFS Plus."
  (interactive "r")
  (ucs-normalize-region from to
                        ucs-normalize-hfs-nfd-quick-check-regexp
                        'ucs-normalize-hfs-nfd-table
                        'ucs-normalize-hfs-nfd-comp-p))
;;;###autoload
(defun ucs-normalize-HFS-NFD-string (str)
  "Normalize the string STR by the Unicode NFD and Mac OS's HFS Plus."
  (ucs-normalize-string ucs-normalize-HFS-NFD-region))
;;;###autoload
(defun ucs-normalize-HFS-NFC-region (from to)
  "Normalize the current region by the Unicode NFC and Mac OS's HFS Plus."
  (interactive "r")
  (ucs-normalize-region from to
                        ucs-normalize-hfs-nfc-quick-check-regexp
                        'ucs-normalize-hfs-nfd-table t))
;;;###autoload
(defun ucs-normalize-HFS-NFC-string (str)
  "Normalize the string STR by the Unicode NFC and Mac OS's HFS Plus."
  (ucs-normalize-string ucs-normalize-HFS-NFC-region))

;; Post-read-conversion function for `utf-8-hfs'.
(defun ucs-normalize-hfs-nfd-post-read-conversion (len)
  (save-excursion
    (save-restriction
      (narrow-to-region (point) (+ (point) len))
104343
396aecca2f45 (ucs-normalize-hfs-nfd-post-read-conversion):
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 104328
      (ucs-normalize-HFS-NFC-region (point-min) (point-max))
      (- (point-max) (point-min)))))
104250
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>

104328
4d1464dfdc96 (ucs-normalize-version): Changed to 1.1.
Kenichi Handa <handa@m17n.org>
parents: 104269
;; Pre-write conversion for `utf-8-hfs'.
(defun ucs-normalize-hfs-nfd-pre-write-conversion (from to)
  (let ((old-buf (current-buffer)))
    (set-buffer (generate-new-buffer " *temp*"))
104683
2b8eeeaa8c1d * international/ucs-normalize.el (ucs-normalize-sort, quick-check-list):
Juanma Barranquero <lekktu@gmail.com>
parents: 104540
    (if (stringp from)
104328
4d1464dfdc96 (ucs-normalize-version): Changed to 1.1.
Kenichi Handa <handa@m17n.org>
parents: 104269
        (insert from)
      (insert-buffer-substring old-buf from to))
    (ucs-normalize-HFS-NFD-region (point-min) (point-max))
    nil))

104250
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
;;; coding-system definition
(define-coding-system 'utf-8-hfs
104328
4d1464dfdc96 (ucs-normalize-version): Changed to 1.1.
Kenichi Handa <handa@m17n.org>
parents: 104269
  "UTF-8 based coding system for MacOS HFS file names.
104250
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
The singleton characters in HFS normalization exclusion will not
104328
4d1464dfdc96 (ucs-normalize-version): Changed to 1.1.
Kenichi Handa <handa@m17n.org>
parents: 104269
be decomposed."
104250
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
  :coding-type 'utf-8
  :mnemonic ?U
  :charset-list '(unicode)
  :post-read-conversion 'ucs-normalize-hfs-nfd-post-read-conversion
104328
4d1464dfdc96 (ucs-normalize-version): Changed to 1.1.
Kenichi Handa <handa@m17n.org>
parents: 104269
  :pre-write-conversion 'ucs-normalize-hfs-nfd-pre-write-conversion
104250
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
  )
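
How the two conversion hooks of `utf-8-hfs' interact can be sketched with a string round trip. This is a minimal illustration, assuming the coding system behaves as its definition above suggests (HFS NFD on write, HFS NFC on read); the `my-` variable names and the sample character are hypothetical.

```elisp
(require 'ucs-normalize)

;; Hypothetical file-name fragment: "é" as the single code point U+00E9.
(defvar my-name (string #xE9))

;; Encoding runs the pre-write conversion (HFS NFD) and then
;; produces UTF-8 bytes.
(defvar my-bytes (encode-coding-string my-name 'utf-8-hfs))

;; Decoding the same bytes as plain UTF-8 exposes the decomposed form
;; that would be stored on an HFS Plus volume.
(defvar my-on-disk (decode-coding-string my-bytes 'utf-8))

;; Decoding with `utf-8-hfs' runs the post-read conversion (HFS NFC),
;; which should restore the composed form.
(defvar my-round-trip (decode-coding-string my-bytes 'utf-8-hfs))
```

Decomposing only when writing keeps buffer text in composed form while file names follow the decomposed convention that HFS Plus expects.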

(provide 'ucs-normalize)

104269
b086ab13a67a Add a `coding' file variable.
Eli Zaretskii <eliz@gnu.org>
parents: 104264
;; Local Variables:
;; coding: utf-8
;; End:

104264
be5412b66c92 Add arch tagline
Miles Bader <miles@gnu.org>
parents: 104250
;; arch-tag: cef65ae7-71ad-4e19-8da8-56ab4d42aaa4
104250
a6cdfcf4b769 New file.
Kenichi Handa <handa@m17n.org>
;;; ucs-normalize.el ends here