comparison etc/=TO-DO @ 2306:59c8668f70c7
Merged in CHARACTERS
author | Eric S. Raymond <esr@snark.thyrsus.com> |
---|---|
date | Mon, 22 Mar 1993 03:00:23 +0000 |
parents | 216f86e5891d |
children |
2305:784262b28079 | 2306:59c8668f70c7 |
---|---|
22 a character by character basis. Then make non-full-screen-width | 22 a character by character basis. Then make non-full-screen-width |
23 mode lines inverse video, and display the marked location in | 23 mode lines inverse video, and display the marked location in |
24 inverse video. | 24 inverse video. |
25 | 25 |
26 * VMS code to list a file directory. Make dired work. | 26 * VMS code to list a file directory. Make dired work. |
27 | |
28 Long range: | |
29 | |
30 Ideas for extending GNU Emacs to deal with arbitrary character sets. | |
31 | |
32 I would like GNU Emacs to be extended to handle all the world's alphabets | |
33 and word signs. I don't expect to have time to do such a thing in the next | |
34 few years, so here are my ideas on the best way to do it. | |
35 | |
36 * Each graphic is represented by a sequence of ordinary 8-bit characters. | |
37 | |
38 * All the characters that make up such a sequence have codes >= 0200. | |
39 | |
40 * The first character of such a sequence is between 0200 and 0237. | |
41 | |
42 * The remaining characters of such a sequence are all 0240 or higher. | |
43 | |
44 * The first character of the sequence determines the number of characters | |
45 in the sequence. Thus, 0200...0207 could start two-character sequences, | |
46 0210...0227 could start three-character sequences, and 0230 could start | |
47 four-character sequences. (Codes 0231...0237 would be reserved.) | |
48 | |
49 * Several common alphabets, and some mathematical symbols, would get | |
50 two-character sequences. (Probably Greek, Russian, Hebrew(?), Arabic(?), | |
51 Korean, and Japanese kana). The remaining alphabets, and some versions of | |
52 Chinese, would get three-character sequences. Other sets of Chinese | |
53 characters would get four-character sequences. | |
54 | |
55 Each country that uses Chinese characters has its own standard character | |
56 set, and it is not easy to correlate them to avoid overlap. So there may | |
57 need to be several sets of Chinese characters. That is why they need so | |
58 much code space. | |
59 | |
60 True support for Hebrew and Arabic requires dealing with the problem of | |
61 writing direction for mixed text; I don't know what to do for that. | |
62 | |
63 * The functions that use the syntax table would determine the | |
64 syntax of a sequence from its first character. | |
65 | |
66 * Functions in indent.c for computing widths and columns would | |
67 determine the width of a sequence from its first character. | |
68 So would display routines. | |
69 | |
70 * Only a few other editing routines would need any change. In | |
71 particular, searching and regexp matching might not need any change. | |
72 | |
73 * Most of the work required would be in redisplay. The only case that | |
74 needs to be supported is with X windows, since ordinary terminals | |
75 can't display all these characters anyway. | |
76 | |
77 * There might need to be code to translate files from this format | |
78 to whatever format is typically stored on disk. | |
79 | |
80 | |
81 I would be very unhappy with half-measures, such as support for | |
82 Japanese only. | |
83 |
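
The byte layout proposed in the removed text (old lines 36 to 47) can be illustrated with a short sketch. This is not code from Emacs; the function names `sequence_length` and `split_graphics` are hypothetical, and treating every byte below 0200 as a one-byte ASCII character is an assumption the note implies but does not spell out. The ranges themselves are the ones the note says "could" be used.

```python
def sequence_length(lead):
    """Return how many bytes the sequence starting with `lead` occupies.

    Ranges follow the TO-DO note; bytes below 0200 are assumed to be
    ordinary single-byte characters (an assumption, not stated outright).
    """
    if lead < 0o200:
        return 1                      # plain ASCII, stands alone
    if lead <= 0o207:
        return 2                      # 0200...0207: two-character sequences
    if lead <= 0o227:
        return 3                      # 0210...0227: three-character sequences
    if lead == 0o230:
        return 4                      # 0230: four-character sequences
    if lead <= 0o237:
        raise ValueError("lead byte %#o is reserved" % lead)
    raise ValueError("byte %#o is a trailing byte, not a lead byte" % lead)


def split_graphics(data):
    """Split a bytes object into one bytes chunk per graphic."""
    graphics = []
    i = 0
    while i < len(data):
        n = sequence_length(data[i])
        chunk = data[i:i + n]
        if len(chunk) != n:
            raise ValueError("truncated sequence at offset %d" % i)
        # Every byte after the first must be 0240 or higher.
        if any(b < 0o240 for b in chunk[1:]):
            raise ValueError("bad trailing byte at offset %d" % i)
        graphics.append(chunk)
        i += n
    return graphics
```

Under these assumptions, `split_graphics(b"a\x80\xa1b")` returns `[b"a", b"\x80\xa1", b"b"]`. Because lead bytes (0200 to 0237) and trailing bytes (0240 and up) occupy disjoint ranges, a byte-level search for a complete sequence can only match at a sequence boundary, which is presumably why the note expects searching to need little or no change.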