annotate README.unicode @ 91921:2e27479c19fe
Reverted previous erroneous change.
| author | Bastien Guerry <bzg@altern.org> |
|---|---|
| date | Sun, 17 Feb 2008 23:31:06 +0000 |
| parents | d9c3dce41f29 |
| children |
| rev | line source |
|---|---|
| 89496 | 1 -*-mode: text; coding: latin-1;-*- |
| 2 | |
| 91564 | 3 Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007, 2008 |
| 4 Free Software Foundation, Inc. | |
| 5 See the end of the file for license conditions. | |
| 6 | |
| 91565 | 7 Problems, fixmes and other unicode-related issues |
| 89496 | 8 ------------------------------------------------------------- |
| 9 | |
| 10 Notes by fx to record various things of variable importance. handa | |
| 11 needs to check them -- don't take too seriously, especially with | |
| 12 regard to completeness. | |
| 13 | |
| 14 * SINGLE_BYTE_CHAR_P returns true for Latin-1 characters, which has | |
| 15 undesirable effects. E.g.: | |
| 16 (multibyte-string-p (let ((s "x")) (aset s 0 ?£) s)) => nil | |
| 17 (multibyte-string-p (concat [?£])) => nil | |
| 18 (text-char-description ?£) => "M-#" | |
| 19 | |
| 20 These examples are all fixed by the change of 2002-10-14, but | |
| 91827 | 21 questionable uses of SINGLE_BYTE_CHAR_P remain in the |
| 89837 | 22 code (keymap.c and print.c). |
| 89496 | 23 |
| 24 * Rationalize character syntax and its relationship to the Unicode | |
| 25 database. (Applies mainly to symbol and punctuation syntax.) | |
| 26 | |
| 27 * Fontset handling and customization need work. We want to relate | |
| 28 fonts to scripts, probably based on the Unicode blocks. The | |
| 29 presence of small-repertoire 10646-encoded fonts in XFree 4 is a | |
| 30 pain, not currently worked round. | |
| 31 | |
| 32 With the change on 2002-07-26, multiple fonts can be | |
| 33 specified in a fontset for a specific range of characters. | |
| 34 Each range can also be specified by script. Before using | |
| 35 ISO10646 fonts, Emacs checks their repertoires to avoid | |
| 36 fonts that don't have a glyph for a specific character. | |
| 37 | |
| 89525 | 38 fx has worked on fontset customization, but was stymied by |
| 39 basic problems with the way the default face is dealt with | |
| 40 (and something else, I think). This needs revisiting. | |
| 41 | |
| 89496 | 42 * Work is also needed on charset and coding system priorities. |
| 43 | |
| 44 * The relevant bits of latin1-disp.el need porting (and probably | |
| 45 re-naming/updating). See also cyril-util.el. | |
| 46 | |
| 89525 | 47 * Quail files need more work now that the encoding is largely irrelevant. |
| 89496 | 48 |
| 49 * What to do with the old coding categories stuff? | |
| 50 | |
| 51 * The preferred-coding-system property of charsets should probably be | |
| 52 junked unless it can be made more useful now. | |
| 53 | |
| 54 * find-multibyte-characters needs looking at. | |
| 55 | |
| 56 * Implement Korean cp949/UHC, BIG5-HKSCS and any other important missing | |
| 57 charsets. | |
| 58 | |
| 59 * Lazy-load tables for unify-charset somehow? | |
| 60 | |
| 91827 | 61 Actually, Emacs clears out all charset maps and unify-map just |
| 62 before dumping, and they are loaded again on demand by the | |
| 89496 | 63 dumped emacs. But, those maps (char tables) generated while |
| 91827 | 64 temacs is running can't be removed from the dumped emacs. |
| 89496 | 65 |
| 66 * Translation tables for {en,de}code currently aren't supported. | |
| 67 | |
| 68 This should be fixed by the changes of 2002-10-14. | |
| 69 | |
| 70 * Defining CCL coding systems currently doesn't work. | |
| 71 | |
| 72 This should be fixed by the changes of 2003-01-30. | |
| 73 | |
| 74 * iso-2022 charsets get unified on i/o. | |
| 75 | |
| 76 With the change on 2003-01-06, decoding routines put a `charset' | |
| 77 property on decoded text, and the iso-2022 encoder pays attention | |
| 78 to it. Thus, for instance, reading and writing with | |
| 79 iso-2022-7bit preserves the original designation sequences. | |
| 80 Would the property name `preferred-charset' be better? | |
| 81 | |
| 82 We may have to use this property to choose a font. | |
| 83 | |
| 84 * Revisit locale processing: look at treating the language and | |
| 85 charset parts separately. (Language should affect things like | |
| 91827 | 86 spelling and calendar, but that's not a Unicode issue.) |
| 89496 | 87 |
| 88 * Handle Unicode combining characters usefully, e.g. diacritics, and | |
| 89 handle more scripts specifically (à la Devanagari). There are | |
| 90 issues with canonicalization. | |
| 91 | |
| 92 * Bidi is a separate issue with no support currently. | |
| 93 | |
| 94 * We need tabular input methods, e.g. for maths symbols. (Not | |
| 95 specific to Unicode.) | |
| 96 | |
| 97 * Need multibyte text in menus, e.g. for the above. (Not specific to | |
| 89525 | 98 Unicode -- see Emacs etc/TODO, but now mostly works with gtk.) |
| 89496 | 99 |
| 100 * There's currently no support for Unicode normalization. | |
| 101 | |
| 91827 | 102 * Populate char-width-table correctly for Unicode characters and |
| 89496 | 103 worry about what happens when double-width charsets covering |
| 104 non-CJK characters are unified. | |
| 105 | |
| 106 * Emacs 20/21 .elc files are currently not loadable. It may or may | |
| 107 not be possible to do this properly. | |
| 108 | |
| 109 With the change on 2002-07-24, elc files generated by Emacs | |
| 110 20.3 and later are correctly loaded (including those | |
| 111 containing multibyte characters and compressed). But, elc | |
| 112 files generated by 20.2 and earlier are still not loadable. | |
| 113 Is it really worth working on it? | |
| 114 | |
| 115 * Rmail won't work with non-ASCII text. Encoding issues for Babyl | |
| 116 files need sorting out, but rms says Babyl will go before this is | |
| 117 released. | |
| 118 | |
| 119 * Gnus still needs some attention, and we need to get changes | |
| 120 accepted by Gnus maintainers... | |
| 121 | |
| 122 * There are type errors lurking, e.g. in | |
| 123 Fcheck_coding_systems_region. Define ENABLE_CHECKING to find them. | |
| 124 | |
| 125 * You can grep the code for lots of fixmes. | |
| 126 | |
| 127 * Old auto-save files, and similar files, such as Gnus drafts, | |
| 128 containing non-ASCII characters probably won't be re-read correctly. | |
| 90424 | 129 |
| 130 | |
| 131 | |
| 132 New font handling mechanism with font backend method | |
| 133 ---------------------------------------------------- | |
| 134 | |
| 91564 | 135 Emacs now contains new code for handling fonts through multiple font |
| 90424 | 136 backends. The old font handling code still exists in parallel |
| 137 with the new code, and the new code is used only when you configure | |
| 90927 | 138 Emacs with the argument "--enable-font-backend". |
| 90424 | 139 |
| 90611 | 140 Which font backends to use can be specified by the X resource |
| 141 "FontBackend". For instance, if you want to use Xft fonts only, | |
| 142 | |
| 143 Emacs.FontBackend: xft | |
| 144 | |
| 145 will work. If this resource is not set, Emacs tries to use all font | |
| 146 backends available on your graphic device. | |
| 147 | |
| 90424 | 148 The configure script, if invoked with "--enable-font-backend", checks |
| 90927 | 149 whether the freetype and fontconfig libraries exist. If they are both |
| 90598 | 150 available, the macro "USE_FONT_BACKEND" is defined in src/config.h. In |
| 151 that case, the existence of the Xft library is checked too. | |
| 90424 | 152 |
| 153 The new files are: | |
| 90597 | 154 font.h -- header providing font-backend related structures |
| 155 (the most important are "struct font" and "struct | |
| 156 font_driver"), macros, etc. | |
| 90424 | 157 font.c -- main font handling code. |
| 158 xfont.c -- font-driver on X for X core fonts. | |
| 90597 | 159 ftfont.c -- generic font-driver for FreeType fonts providing |
| 160 device-independent methods of struct font_driver. | |
| 161 xftfont.c -- font-driver on X using Xft for FreeType fonts | |
| 162 utilizing methods provided by ftfont.c. | |
| 163 ftxfont.c -- font-driver on X directly using FreeType fonts | |
| 164 utilizing methods provided by ftfont.c. | |
| 90912 | 165 w32font.c -- font driver on w32 using Windows native fonts, |
| 166 corresponding to xfont.c | |
| 90424 | 167 |
| 90597 | 168 So we already have code for X. For the other systems (w32 and mac), |
| 90424 | 169 it seems that we need these files: |
| 90597 | 170 atmfont.c -- font-driver on mac using ATM fonts, corresponding |
| 171 to xfont.c | |
| 172 As BDF fonts are currently used on w32, we may also implement these: | |
| 173 bdffont.c -- generic font-driver for BDF fonts, corresponding to | |
| 174 ftfont.c | |
| 175 bdfw32font.c -- font-driver on w32 using BDF fonts, | |
| 176 corresponding to ftxfont.c | |
| 177 But, as FreeType already supports BDF fonts, if FreeType and | |
| 178 Fontconfig are also available on w32, what we need may be: | |
| 179 ftw32font.c -- font-driver on w32 directly using FreeType fonts | |
| 180 utilizing methods provided by ftfont.c. | |
| 90424 | 181 |
| 90912 | 182 And, for those to work, macterm.c and macfns.c must be changed in a |
| 183 way similar to xterm.c and xfns.c (the parts "#ifdef USE_FONT_BACKEND" | |
| 184 ... "#endif" should be checked). | |
| 90597 | 185 |
| 186 It would be interesting if Emacs supported a frame buffer directly and | |
| 187 had these font drivers: | |
| 90424 | 188 ftfbfont.c -- font-driver on FB for FreeType fonts. |
| 189 bdffbfont.c -- font-driver on FB for BDF fonts. | |
| 90705 | 190 |
| 91827 | 191 Note: The fontset-related code is not yet mature enough to work well with |
| 90705 | 192 the font backend method. So, for instance, even if you start Emacs |
| 193 with something like this: | |
| 90927 | 194 % emacs -fn tahoma |
| 90706 | 195 non-ASCII Latin characters will not be displayed in the font "tahoma". |
| 196 In such a case, please try this: | |
| 90705 | 197 |
| 198 (set-fontset-font "fontset-default" 'latin '("tahoma" . "unicode-bmp")) | |
| 91564 | 199 |
| 200 | |
| 201 This file is part of GNU Emacs. | |
| 202 | |
| 203 GNU Emacs is free software; you can redistribute it and/or modify | |
| 204 it under the terms of the GNU General Public License as published by | |
| 205 the Free Software Foundation; either version 3, or (at your option) | |
| 206 any later version. | |
| 207 | |
| 208 GNU Emacs is distributed in the hope that it will be useful, | |
| 209 but WITHOUT ANY WARRANTY; without even the implied warranty of | |
| 210 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the | |
| 211 GNU General Public License for more details. | |
| 212 | |
| 213 You should have received a copy of the GNU General Public License | |
| 214 along with GNU Emacs; see the file COPYING. If not, write to the | |
| 215 Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, | |
| 216 Boston, MA 02110-1301, USA. | |
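
Appendix sketch for the fontset note in the font backend section: the set-fontset-font workaround given there could be placed in an init file roughly as follows. This is only a sketch. It assumes a build configured with "--enable-font-backend" and an installed font family named "tahoma"; the window-system guard and the error handling are additions for illustration and are not part of the original note.

```elisp
;; Sketch only: assumes a font family named "tahoma" is installed and
;; that Emacs was configured with "--enable-font-backend".
(when (and window-system                    ; fontsets matter only on graphic displays
           (fboundp 'set-fontset-font))     ; not defined in builds without window-system support
  (condition-case err
      ;; Same call as in the note: use tahoma for the `latin' script
      ;; in the default fontset.
      (set-fontset-font "fontset-default" 'latin
                        '("tahoma" . "unicode-bmp"))
    (error
     ;; Report the failure instead of aborting the rest of the init file.
     (message "Could not set Latin font: %s"
              (error-message-string err)))))
```

Wrapping the call in condition-case keeps a missing font from interrupting startup; whether that is desirable depends on how strictly the init file should fail.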
