annotate lisp/language/hebrew.el @ 111307:707be8bc83af
merge trunk
author | Kenichi Handa <handa@m17n.org> |
---|---|
date | Mon, 01 Nov 2010 16:53:08 +0900 |
parents | 4d54e23aa31e |
children | 417b1e4d63cd |
rev | line source |
---|---|
109354 | 1 ;;; hebrew.el --- support for Hebrew -*- coding: utf-8 -*- |
17052 | 2 |
106815 | 3 ;; Copyright (C) 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010 |
74544 | 4 ;; Free Software Foundation, Inc. |
74605 | 5 ;; Copyright (C) 1995, 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, |
106815 | 6 ;; 2005, 2006, 2007, 2008, 2009, 2010 |
62396 | 7 ;; National Institute of Advanced Industrial Science and Technology (AIST) |
8 ;; Registration Number H14PRO021 | |
42058 | 9 |
89483 | 10 ;; Copyright (C) 2003 |
11 ;; National Institute of Advanced Industrial Science and Technology (AIST) | |
12 ;; Registration Number H13PRO009 | |
42058 | 13 |
17052 | 14 ;; Keywords: multilingual, Hebrew |
15 | |
16 ;; This file is part of GNU Emacs. | |
17 | |
94665 | 18 ;; GNU Emacs is free software: you can redistribute it and/or modify |
17052 | 19 ;; it under the terms of the GNU General Public License as published by |
94665 | 20 ;; the Free Software Foundation, either version 3 of the License, or |
21 ;; (at your option) any later version. |
17052 | 22 |
23 ;; GNU Emacs is distributed in the hope that it will be useful, | |
24 ;; but WITHOUT ANY WARRANTY; without even the implied warranty of | |
25 ;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the | |
26 ;; GNU General Public License for more details. | |
27 | |
28 ;; You should have received a copy of the GNU General Public License | |
94665 | 29 ;; along with GNU Emacs. If not, see <http://www.gnu.org/licenses/>. |
17052 | 30 |
31 ;;; Commentary: | |
32 | |
42052 | 33 ;; For Hebrew, the character set ISO8859-8 is supported. |
37112 | 34 ;; See http://www.ecma.ch/ecma1/STAND/ECMA-121.HTM. |
42052 | 35 ;; Windows-1255 is also supported. |
42058 | 36 |
17052 | 37 ;;; Code: |
38 | |
88414 | 39 (define-coding-system 'hebrew-iso-8bit |
40 "ISO 2022 based 8-bit encoding for Hebrew (MIME:ISO-8859-8)." |
41 :coding-type 'charset |
42 :mnemonic ?8 |
43 :charset-list '(iso-8859-8) |
88513 | 44 :mime-charset 'iso-8859-8) |
17052 | 45 |
18520 | 46 (define-coding-system-alias 'iso-8859-8 'hebrew-iso-8bit) |
18203 | 47 |
29157 | 48 ;; These are for Explicit and Implicit directionality information, as |
109596 | 49 ;; defined in RFC 1556. |
29157 | 50 (define-coding-system-alias 'iso-8859-8-e 'hebrew-iso-8bit) |
51 (define-coding-system-alias 'iso-8859-8-i 'hebrew-iso-8bit) |
52 |
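Since `iso-8859-8`, `iso-8859-8-e`, and `iso-8859-8-i` are all defined above as aliases of `hebrew-iso-8bit`, they should resolve to the same base coding system. A minimal sketch for checking this in a running Emacs, assuming this file (or a later version that keeps these aliases) is loaded; the helper name `example-hebrew-alias-bases` is hypothetical and not part of hebrew.el:

```elisp
;; Hypothetical helper, not part of hebrew.el.
(defun example-hebrew-alias-bases ()
  "Return the base coding system of each ISO-8859-8 alias."
  (mapcar #'coding-system-base
          '(iso-8859-8 iso-8859-8-e iso-8859-8-i)))
```

If the aliases are in effect, every element of the returned list is `hebrew-iso-8bit`; the explicit/implicit directionality variants are not distinguished at the coding-system level.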
17052 | 53 (set-language-info-alist |
109596 | 54 "Hebrew" '((tutorial . "TUTORIAL.he") |
55 (charset iso-8859-8) |
22714 | 56 (coding-priority hebrew-iso-8bit) |
88707 | 57 (coding-system hebrew-iso-8bit windows-1255 cp862) |
88414 | 58 (nonascii-translation . iso-8859-8) |
22982 | 59 (input-method . "hebrew") |
60 (unibyte-display . hebrew-iso-8bit) |
109354 | 61 (sample-text . "Hebrew שלום") |
108598 | 62 (documentation . "Bidirectional editing is supported."))) |
17052 | 63 |
42052 | 64 (set-language-info-alist |
65 "Windows-1255" '((coding-priority windows-1255) | |
66 (coding-system windows-1255) | |
42152 | 67 (documentation . "\ |
68 Support for Windows-1255 encoding, e.g. for Yiddish. | |
108598 | 69 Bidirectional editing is supported."))) |
42052 | 70 |
88558 | 71 (define-coding-system 'windows-1255 |
72 "windows-1255 (Hebrew) encoding (MIME: WINDOWS-1255)" |
73 :coding-type 'charset |
74 :mnemonic ?h |
75 :charset-list '(windows-1255) |
76 :mime-charset 'windows-1255) |
77 (define-coding-system-alias 'cp1255 'windows-1255) |
78 |
88619 | 79 (define-coding-system 'cp862 |
88707 | 80 "DOS codepage 862 (Hebrew)" |
88619 | 81 :coding-type 'charset |
82 :mnemonic ?D |
83 :charset-list '(cp862) |
84 :mime-charset 'cp862) |
85 (define-coding-system-alias 'ibm862 'cp862) |
86 |
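The three coding systems defined above place the Hebrew letters at different byte positions. The following sketch, assuming the coding systems `hebrew-iso-8bit`, `windows-1255`, and `cp862` are available in the running Emacs, encodes HEBREW LETTER SHIN (U+05E9) with each of them. The names `example-hebrew-encode-bytes` and `example-shin-bytes` are hypothetical and not part of hebrew.el:

```elisp
;; Hypothetical helper, not part of hebrew.el.
(defun example-hebrew-encode-bytes (text coding-system)
  "Return TEXT encoded with CODING-SYSTEM as a list of byte values."
  (append (encode-coding-string text coding-system) nil))

;; Alist of (CODING-SYSTEM BYTE...) for HEBREW LETTER SHIN (U+05E9).
(defvar example-shin-bytes
  (mapcar (lambda (cs)
            (cons cs (example-hebrew-encode-bytes "\u05E9" cs)))
          '(hebrew-iso-8bit windows-1255 cp862)))
```

ISO-8859-8 and Windows-1255 both put the letter at byte #xF9, while DOS codepage 862 puts it at #x99, so text in one of the 8-bit Windows or ISO encodings cannot be read correctly as cp862 or vice versa.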
109354 | 87 ;; Return a nested alist mapping Hebrew character sequences to the |
88 ;; corresponding precomposed glyphs of FONT-OBJECT. | |
89 (defun hebrew-font-get-precomposed (font-object) | |
90 (let ((precomposed (font-get font-object 'hebrew-precomposed)) | |
110361 | 91 ;; Vector of Hebrew precomposed characters. |
109354 | 92 (chars [#xFB2A #xFB2B #xFB2C #xFB2D #xFB2E #xFB2F #xFB30 #xFB31 |
93 #xFB32 #xFB33 #xFB34 #xFB35 #xFB36 #xFB38 #xFB39 #xFB3A | |
94 #xFB3B #xFB3C #xFB3E #xFB40 #xFB41 #xFB43 #xFB44 #xFB46 | |
95 #xFB47 #xFB48 #xFB49 #xFB4A #xFB4B #xFB4C #xFB4D #xFB4E]) | |
96 ;; Vector of decomposition character sequences corresponding | |
97 ;; to the above vector. | |
110361 | 98 (decomposed |
109354 | 99 [[#x05E9 #x05C1] |
100 [#x05E9 #x05C2] | |
101 [#x05E9 #x05BC #x05C1] | |
102 [#x05E9 #x05BC #x05C2] | |
103 [#x05D0 #x05B7] | |
104 [#x05D0 #x05B8] | |
105 [#x05D0 #x05BC] | |
106 [#x05D1 #x05BC] | |
107 [#x05D2 #x05BC] | |
108 [#x05D3 #x05BC] | |
109 [#x05D4 #x05BC] | |
110 [#x05D5 #x05BC] | |
111 [#x05D6 #x05BC] | |
112 [#x05D8 #x05BC] | |
113 [#x05D9 #x05BC] | |
114 [#x05DA #x05BC] | |
115 [#x05DB #x05BC] | |
116 [#x05DC #x05BC] | |
117 [#x05DE #x05BC] | |
118 [#x05E0 #x05BC] | |
119 [#x05E1 #x05BC] | |
120 [#x05E3 #x05BC] | |
121 [#x05E4 #x05BC] | |
122 [#x05E6 #x05BC] | |
123 [#x05E7 #x05BC] | |
124 [#x05E8 #x05BC] | |
125 [#x05E9 #x05BC] | |
126 [#x05EA #x05BC] | |
127 [#x05D5 #x05B9] | |
128 [#x05D1 #x05BF] | |
129 [#x05DB #x05BF] | |
130 [#x05E4 #x05BF]])) | |
131 (unless precomposed | |
132 (setq precomposed (list t)) | |
133 (let ((gvec (font-get-glyphs font-object 0 (length chars) chars))) | |
134 (dotimes (i (length chars)) | |
135 (if (aref gvec i) | |
136 (set-nested-alist (aref decomposed i) (aref gvec i) | |
137 precomposed)))) | |
138 ;; Cache the result in FONT-OBJECT's property. | |
139 (font-put font-object 'hebrew-precomposed precomposed)) | |
140 precomposed)) | |
141 | |
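The cache built by `hebrew-font-get-precomposed` is a nested alist keyed by character sequences. The sketch below shows how `set-nested-alist` and `lookup-nested-alist` (from `mule-util`) behave on such a table. It is an illustration under stated assumptions: symbols stand in for the glyph vectors a real font would supply, and the names `example-precomposed` and `example-lookup` are hypothetical:

```elisp
(require 'mule-util)  ; provides set-nested-alist and lookup-nested-alist

;; Hypothetical table: symbols stand in for the glyph vectors that
;; hebrew-font-get-precomposed stores.
(defvar example-precomposed
  (let ((table (list t)))
    (set-nested-alist [#x05E9 #x05C1] 'shin-with-shin-dot table)
    (set-nested-alist [#x05D0 #x05B7] 'alef-with-patah table)
    table))

(defun example-lookup (header &optional len)
  "Look up HEADER, shaped like [FONT CHAR1 CHAR2 ...], in the example table.
Index 0 (the font slot) is skipped by passing START = 1.
LEN, if non-nil, limits how many elements of HEADER are considered."
  (lookup-nested-alist header example-precomposed len 1))

;; Full match: returns a nested alist whose car is the entry.
(example-lookup [font #x05E9 #x05C1])
;; Sequence too long for the table: returns an integer, the index in
;; HEADER at which the lookup stopped.
(example-lookup [font #x05E9 #x05C1 #x05B7])
;; Retrying with LEN set to that integer finds the prefix entry.
(example-lookup [font #x05E9 #x05C1 #x05B7] 3)
```

When the whole sequence is not in the table, the integer result tells the caller how long a prefix is worth retrying, after which the caller must still check that the prefix node actually holds an entry.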
142 ;; Composition function for Hebrew. GSTRING consists of a Hebrew base | |
143 ;; character followed by Hebrew diacritical marks, or of a | |
144 ;; single Hebrew diacritical mark. Adjust GSTRING to display that | |
145 ;; sequence properly. The basic strategy is: | |
146 ;; | |
147 ;; (1) If there's a single diacritical mark, add padding space to the | |
148 ;; left and right of the glyph as needed. | |
149 ;; | |
150 ;; (2) If the font has OpenType features for Hebrew, let the OTF | |
151 ;; driver do the whole work. | |
152 ;; | |
153 ;; (3) Otherwise, if the font has precomposed glyphs, use them as far | |
154 ;; as possible. Adjust the remaining glyphs artificially. | |
155 | |
108762 | 156 (defun hebrew-shape-gstring (gstring) |
109354 | 157 (let* ((font (lgstring-font gstring)) |
158 (otf (font-get font :otf)) | |
159 (nchars (lgstring-char-len gstring)) | |
160 header nglyphs base-width glyph precomposed val idx) | |
161 (cond | |
162 ((= nchars 1) | |
163 ;; Independent diacritical mark. Add padding space to left or | |
164 ;; right so that the glyph doesn't overlap with the surrounding | |
165 ;; chars. | |
166 (setq glyph (lgstring-glyph gstring 0)) | |
167 (let ((width (lglyph-width glyph)) | |
168 bearing) | |
169 (if (< (setq bearing (lglyph-lbearing glyph)) 0) | |
170 (lglyph-set-adjustment glyph bearing 0 (- width bearing))) | |
171 (if (> (setq bearing (lglyph-rbearing glyph)) width) | |
172 (lglyph-set-adjustment glyph 0 0 bearing)))) | |
173 | |
174 ((or (assq 'hebr (car otf)) (assq 'hebr (cdr otf))) | |
175 ;; FONT has OpenType features for Hebrew. | |
176 (font-shape-gstring gstring)) | |
177 | |
178 (t | |
179 ;; FONT doesn't have OpenType features for Hebrew. | |
180 ;; Try a precomposed glyph. | |
181 ;; Now GSTRING is in this form: | |
182 ;; [[FONT CHAR1 CHAR2 ... CHARn] nil GLYPH1 GLYPH2 ... GLYPHn nil ...] | |
183 (setq precomposed (hebrew-font-get-precomposed font) | |
184 header (lgstring-header gstring) | |
185 val (lookup-nested-alist header precomposed nil 1)) | |
186 (if (and (consp val) (vectorp (car val))) | |
187 ;; All characters can be displayed by a single precomposed glyph. | |
188 ;; Reform GSTRING to [HEADER nil PRECOMPOSED-GLYPH nil ...] | |
189 (let ((glyph (copy-sequence (car val)))) | |
190 (lglyph-set-from-to glyph 0 (1- nchars)) | |
191 (lgstring-set-glyph gstring 0 glyph) | |
192 (lgstring-set-glyph gstring 1 nil)) | |
193 (if (and (integerp val) (> val 2) | |
194 (setq glyph (lookup-nested-alist header precomposed val 1)) | |
195 (consp glyph) (vectorp (car glyph))) | |
196 ;; The first (1- VAL) characters can be displayed by a | |
197 ;; precomposed glyph. For example, if VAL is 3, the first | |
198 ;; two glyphs should be replaced by the precomposed glyph. | |
199 ;; In that case, reform GSTRING to: | |
200 ;; [HEADER nil PRECOMPOSED-GLYPH GLYPH3 ... GLYPHn nil ...] | |
201 (let* ((ncmp (1- val)) ; number of composed glyphs | |
202 (diff (1- ncmp))) ; number of reduced glyphs | |
203 (setq glyph (copy-sequence (car glyph))) | |
204 (lglyph-set-from-to glyph 0 (1- nchars)) | |
205 (lgstring-set-glyph gstring 0 glyph) | |
206 (setq idx ncmp) | |
207 (while (< idx nchars) | |
208 (setq glyph (lgstring-glyph gstring idx)) | |
209 (lglyph-set-from-to glyph 0 (1- nchars)) | |
210 (lgstring-set-glyph gstring (- idx diff) glyph) | |
211 (setq idx (1+ idx))) | |
212 (lgstring-set-glyph gstring (- idx diff) nil) | |
213 (setq idx (- ncmp diff) | |
214 nglyphs (- nchars diff))) | |
215 (setq glyph (lgstring-glyph gstring 0)) | |
216 (lglyph-set-from-to glyph 0 (1- nchars)) | |
217 (setq idx 1 nglyphs nchars)) | |
218 ;; Now IDX is an index to the first non-precomposed glyph. | |
219 ;; Adjust positions of the remaining glyphs artificially. | |
220 (setq base-width (lglyph-width (lgstring-glyph gstring 0))) | |
221 (while (< idx nglyphs) | |
222 (setq glyph (lgstring-glyph gstring idx)) | |
223 (lglyph-set-from-to glyph 0 (1- nchars)) | |
224 (if (>= (lglyph-lbearing glyph) (lglyph-width glyph)) | |
225 ;; It seems that this glyph is designed to be rendered | |
226 ;; before the base glyph. | |
227 (lglyph-set-adjustment glyph (- base-width) 0 0) | |
228 (if (>= (lglyph-lbearing glyph) 0) | |
229 ;; Align the horizontal center of this glyph to the | |
230 ;; horizontal center of the base glyph. | |
231 (let ((width (- (lglyph-rbearing glyph) | |
232 (lglyph-lbearing glyph)))) | |
233 (lglyph-set-adjustment glyph | |
234 (- (/ (- base-width width) 2) | |
235 (lglyph-lbearing glyph) | |
236 base-width) 0 0)))) | |
237 (setq idx (1+ idx)))))) | |
238 gstring)) | |
108762 | 239 |
109734 | 240 (let* ((base "[\u05D0-\u05F2]") |
241 (combining "[\u0591-\u05BD\u05BF\u05C1-\u05C2\u05C4-\u05C5\u05C7]+") |
242 (pattern1 (concat base combining)) |
243 (pattern2 (concat base "\u200D" combining))) |
108762 | 244 (set-char-table-range |
245 composition-function-table '(#x591 . #x5C7) | |
109354 | 246 (list (vector pattern2 3 'hebrew-shape-gstring) |
247 (vector pattern2 2 'hebrew-shape-gstring) | |
108762 | 248 (vector pattern1 1 'hebrew-shape-gstring) |
109354 | 249 [nil 0 hebrew-shape-gstring])) |
109734 | 250 ;; Exclude non-combining characters. |
251 (set-char-table-range |
252 composition-function-table #x5BE nil) |
108762 | 253 (set-char-table-range |
254 composition-function-table #x5C0 nil) | |
255 (set-char-table-range | |
109734 | 256 composition-function-table #x5C3 nil) |
257 (set-char-table-range |
108762 | 258 composition-function-table #x5C6 nil)) |
259 | |
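Each rule registered above is a vector of the form [PATTERN PREV-CHARS FUNC]: PATTERN is a regexp for the composable sequence, and PREV-CHARS says how many characters before the combining mark the match should start. The sketch below reuses the `base` and `combining` regexps from the code above to test, outside the display engine, whether a string begins with a composable sequence. The names prefixed with `example-` are hypothetical and not part of hebrew.el:

```elisp
;; Same character classes as `base' and `combining' above.
(defconst example-hebrew-base "[\u05D0-\u05F2]")
(defconst example-hebrew-combining
  "[\u0591-\u05BD\u05BF\u05C1-\u05C2\u05C4-\u05C5\u05C7]+")

(defun example-hebrew-composable-p (string)
  "Return t if STRING starts with a base letter plus combining marks."
  (and (string-match-p
        (concat "\\`" example-hebrew-base example-hebrew-combining)
        string)
       t))

;; SHIN followed by SHIN DOT (U+05C1) is composable.
(example-hebrew-composable-p "\u05E9\u05C1")
;; SHIN followed by MAQAF (U+05BE) or SOF PASUQ (U+05C3) is not,
;; because both are left out of the combining class.
(example-hebrew-composable-p "\u05E9\u05BE")
(example-hebrew-composable-p "\u05E9\u05C3")
```

The excluded code points U+05BE, U+05C0, U+05C3, and U+05C6 are punctuation rather than combining marks, which is why the code also clears their entries in `composition-function-table`.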
33778 | 260 (provide 'hebrew) |
261 | |
93975 | 262 ;; arch-tag: 3ca04f32-3f1e-498e-af46-8267498ba5d9 |
18203 | 263 ;;; hebrew.el ends here |