view cWnn/manual/chap6 @ 17:a3fdd8ad75dc
2ch dictionary has been added
author | Yoshiki Yazawa <yaz@cc.rim.or.jp> |
---|---|
date | Mon, 14 Apr 2008 17:01:14 +0900 |
parents | bbc77ca4def5 |
        ┏━━━━━━━━━━━━━━━━━━━━┓
        ┃ Chapter 6 COMMANDS ┃
        ┗━━━━━━━━━━━━━━━━━━━━┛

┏━━━━━━━━━━━━━━┓
┃ 6.1 OVERVIEW ┃
┗━━━━━━━━━━━━━━┛

This chapter describes the cWnn commands and the other utilities
available.

cWnn Commands:
━━━━━━━━━━━━━━
    1.  cserver    - Startup of the server
    2.  cuum       - Startup of the front-end processor
    3.  cwnnstat   - Current usage status of the server
    4.  cwnnkill   - Termination of the server
    5.  cwnntouch  - Rewrite of dictionary header

cWnn Utilities:
━━━━━━━━━━━━━━━
    6.  catod      - Creation of the binary form of a dictionary
    7.  cdtoa      - Restoration of a dictionary to its text form
    8.  catof      - Creation of the binary form of a grammar file
    9.  cwdreg     - Registration of characters/words into a specified
                     dictionary
    10. cwddel     - Deletion of characters/words from a specified
                     dictionary
    11. cdicsort   - Sorting of a specified text format dictionary

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ 6.2 STARTUP OF SERVER - cserver ┃
┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛

* Description

    cserver - To start the Chinese server.

* Default Path

    /usr/local/bin/cWnn4/cserver

* Command Format

    cserver [-f <file>] [-s <file>]

* Function

    During Chinese input, cserver provides the services (such as
    conversion) and the resources (such as dictionaries and grammar
    files) required by the users (front-end processors). The input
    environment is provided by the front-end processor (cuum), which
    sends its requests to cserver; cserver then performs the service
    and returns the result to the front-end processor. Refer to the
    "Client-Server Model" in Section 1.3.

    Normally, once the system is up, fork() is executed and the server
    runs as a background process. The startup of cserver can be set in
    "/etc/rc" so that it is executed automatically on a Unix system.
    When the "cserver" command is executed, all the settings in the
    initialization file "/usr/local/lib/wnn/zh_CN/cserverrc" are read
    and the corresponding initialization operations are performed.
    In addition, all client resources such as dictionaries and usage
    frequency files are maintained by cserver. Refer to Section 5.3
    for details on "cserverrc".

    If cserver is started while it is already running, an error
    message is given. Refer to Section 2.3.

* Function Options

    -f <file>   <file> is the initialization file for the server.
                If this option is not specified, the default
                initialization file
                "/usr/local/lib/wnn/zh_CN/cserverrc" is read.

    -s <file>   <file> is the logfile of cserver. All error messages
                are directed to this logfile. When <file> is given
                as "-" (e.g. cserver -s -), the error messages are
                sent to the standard error output.

* Note

    The command options inside [ ] in the Command Format are optional.
    If they are not required, "cserver" alone is sufficient to start
    up the Chinese server.

* Related files

    /tmp/cd_sockV4

* Reference

    cserverrc(5.3)

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ 6.3 STARTUP OF FRONT-END PROCESSOR - cuum ┃
┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛

* Description

    cuum - To start the front-end processor.
    (Refer to Section 3.2 for details on the input environment.)

* Default Path

    /usr/local/bin/cWnn4/cuum

* Command Format

    cuum [-h/H] [-x/X] [-k <uumkey_file>] [-c <convert_key_file>]
         [-r <rk_file>] [-D <hostname>] [-n <username>]
         [-l <number_of_lines_for_conversion_input>]

    There are two more input environments:
      (1) Pinyin centred input environment
      (2) Zhuyin centred input environment
    These two environments can be started up by using the "-r" option
    of the "cuum" command together with the default path of each
    input environment. The "-r" option is explained below.
    Refer to Section 3.2 for examples of starting up these two
    environments.

* Function

    Once "cuum" is executed, the initialization file for the front-end
    processor is read from ONE of the following paths, in decreasing
    order of priority:
      (1) The file specified by the UUMRC C-Shell environment variable.
      (2) $HOME/.uumrc
      (3) /usr/local/lib/wnn/zh_CN/uumrc (default path)

    After the initialization file is read, all the initializing
    operations set in the file are performed. Refer to Sections 5.4
    (uumrc) and 5.5 (wnnenvrc) for the initialization details.

    The front-end processor (cuum) and the server (cserver) communicate
    via a socket. Refer to Section 1.3 for details.

    If cuum is started while it is already running, an error message
    is given. Refer to Section 2.3.

* Function Options

    -H      The input mode is set to ON after the startup of cuum.

    -h      The input mode is set to OFF after the startup of cuum
            (default).

    -X      During cuum startup, the flow control of the tty is ON
            (default).

    -x      During cuum startup, the flow control of the tty is OFF.

    -k <uumkey_file>
            Specify the keyboard definition file. If this option is
            not specified, ONE of the following definition files is
            read by default, in decreasing order of priority:
              (1) The filename indicated by "setuumkey" in the
                  initialization file "uumrc". Refer to Sections
                  5.4 (uumrc) and 5.6 (uumkey).
              (2) /usr/local/lib/wnn/zh_CN/uumkey

    -c <convert_key_file>
            Specify the conversion file for the keyboard input codes.
            If this option is not specified, ONE of the following
            files is read by default, in decreasing order of priority:
              (1) The filename indicated by "setconvkey" in the
                  initialization file "uumrc". Refer to Sections
                  5.4 (uumrc) and 5.7 (cvt_key_tbl).
              (2) /usr/local/lib/wnn/cvt_key_tbl

    -r <rk_file>
            Specify the input mode definition file of the input
            automaton.
            If this option is not specified, ONE of the following
            files is read by default, in decreasing order of priority:
              (1) The filename indicated by "setautofile" in the
                  initialization file "uumrc". Refer to Section
                  5.4 (uumrc) and Section 7.5.
              (2) /usr/local/lib/wnn/zh_CN/rk/mode

    -l <lines>
            Specify the number of lines for input at the front-end
            processor (0 < lines < window_line-1). The maximum input
            line is the window line minus one. The default line
            number is 1.

    -D <hostname>
            Specify the server at another host indicated by
            <hostname>. In this case, each user environment may be
            set via "setenv" in the front-end processor initialization
            file, for example "uumrc". If no <hostname> is specified,
            the one given by the environment variable CSERVER is used.

    -n <username>
            Specify the username for the front-end processor. If
            <username> is not specified, the value of the environment
            variable UUMUSER is taken as the default. If UUMUSER is
            not defined, the username of the current front-end
            processor is taken.

* Note

    - The command options inside [ ] in the Command Format are
      optional. If they are not required, "cuum" alone is sufficient
      to start up the front-end processor.
    - During the startup of cuum, a "pty" is required from the
      operating system. If a "pty" cannot be obtained, cuum startup
      fails. Similarly, if the initialization file, the input
      automaton definition files or the keyboard definition file
      cannot be read, cuum startup also fails.

* Reference

    cserver(6.2)  uumrc(5.4)  wnnenvrc(5.5)  uumkey(5.6)
    cvt_key_tbl(5.7)

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ 6.4 CURRENT STATUS REPORT OF SERVER - cwnnstat ┃
┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛

* Description

    cwnnstat - To show the current status of cserver.
* Default Path

    /usr/local/bin/cWnn4/cwnnstat

* Command Format

    cwnnstat [-w] [-e] [-E] [-f] [-F] [-d] [-D] [-L <lang>] [<hostname>]

* Function

    To request the current execution status of the cserver on the
    current host. If <hostname> is specified, the status of the
    cserver on that host is given. Different information may be
    obtained with the function options shown below.

* Function Options

    -w      To list the username, hostname, socket number and the
            environment number.

    -e      To list the environment number, environment name and
            reference count.

    -E      To list the environment number, environment name,
            reference count, grammar file number, number of
            dictionaries used, (list of dictionary numbers) and the
            numbers of the files used in the current environment.

    -f      To list the file ID of each cWnn file in the cserver, the
            file type, the location of the file, reference count and
            the filename.

    -F      Same as the -f option.

    -d      To list the dictionary number of the dictionaries managed
            by the host, the dictionary type, dictionary file number,
            dictionary filename, usage frequency filename and usage
            frequency file number.

    -D      To list the dictionary number, type, conversion method,
            number of entries, static/dynamic, current usage status,
            priority, alias, filename, [(alias:usage frequency
            filename)], [password (frequency password)] of the
            dictionaries.

    -L <lang>
            To specify the language name which is referred to when
            selecting the cserver. If no <lang> is specified, the one
            given by the environment variable LANG is used. The
            default is "zh_CN".

* Note

    - The command options inside [ ] in the Command Format are
      optional. If they are not required, "cwnnstat" alone is
      sufficient to obtain the status of cserver.
    - The dictionary number is different from the file number.
      * File number refers to the standardized number among all cWnn
        files.
      * Dictionary number refers to the logical dictionary number in
        the server.
        - One dictionary file may be used with different usage
          frequency files, and each combination forms an individual
          dictionary.
        - One dictionary file with different conversion methods
          (forward/reverse) forms different dictionaries.

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ 6.5 SERVER TERMINATION - cwnnkill ┃
┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛

* Description

    cwnnkill - To terminate the cserver.

* Default Path

    /usr/local/bin/cWnn4/cwnnkill

* Command Format

    cwnnkill [-L <lang>] [<hostname>]

* Function

    To terminate the cserver. If <hostname> is given, the cserver at
    the specified host is terminated. If no <hostname> is given,
    "cwnnkill" terminates the cserver of UNIX_domain. This is the
    cserver specified in the environment variable CSERVER. If this
    environment variable is not set, "cwnnkill" terminates the
    cserver of the local machine.

    If other front-end processors are still using the cserver to be
    killed, the current usage condition of the cserver is shown and
    termination fails. Refer to Section 2.3.

* Function Options

    -L <lang>
            To specify the language name which is referred to when
            selecting the cserver. If no <lang> is specified, the one
            given by the environment variable LANG is used. The
            default is "zh_CN".

* Note

    The hostname inside [ ] in the Command Format is optional. If it
    is not required, "cwnnkill" alone is sufficient to terminate the
    cserver.

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ 6.6 DICTIONARY HEADER UPDATE - cwnntouch ┃
┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛

* Description

    cwnntouch - To rewrite the header of the dictionary.

* Default Path

    /usr/local/bin/cWnn4/cwnntouch

* Command Format

    cwnntouch <dict_filename>

* Function

    To rewrite the header of the specified dictionary, so that the
    inode recorded for the dictionary matches the inode of the file.
    <dict_filename> is the filename of the binary format dictionary.
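
The inode matching mentioned for cwnntouch can be illustrated with a
small Python sketch. It does not read or write any cWnn file format; it
only shows that a copy of a file receives a different inode number from
the original. That copying a dictionary is a typical reason to run
cwnntouch is an assumption made for this illustration, and the helper
names and file names below are hypothetical.

```python
import os
import shutil
import tempfile


def inode_of(path):
    """Return the inode number the filesystem currently assigns to path."""
    return os.stat(path).st_ino


def copy_changes_inode(src_path, dst_path):
    """Copy src_path to dst_path and report whether the inodes differ."""
    shutil.copyfile(src_path, dst_path)
    return inode_of(src_path) != inode_of(dst_path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        # Placeholder files only; these are NOT real cWnn dictionaries.
        original = os.path.join(workdir, "basic.dic")
        copied = os.path.join(workdir, "basic_copy.dic")
        with open(original, "wb") as handle:
            handle.write(b"placeholder bytes")
        print("inode changed by copy:", copy_changes_inode(original, copied))
```

If a dictionary header stores the inode of the file it was created in,
a copied file would carry a stale value until its header is rewritten.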
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ 6.7 CREATION OF DICTIONARY - catod ┃
┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛

* Description

    catod - To convert the text format of a dictionary to binary
            format.

* Default Path

    /usr/local/bin/cWnn4/catod

* Command Format

    catod [-s <max_word>] [-R] [-r] [-e] [-S] [-U]
          [-P <dic_passwd>] [-p <fre_passwd>]
          [-h <cixing_file>] <out_filename>

* Function

    This command converts a dictionary from text format into binary
    format. <out_filename> is the name of the binary format
    dictionary. If <out_filename> is not given, the output is written
    to the standard output (stdout).

    The input file may be redirected in by using "<". For example,

    ┌──────────────────────────────┐
    │ catod basic.dic < basic.u    │
    └──────────────────────────────┘

    Here "basic.dic" is the output binary format dictionary and
    "basic.u" is the input text format dictionary.

    If the input text dictionary is not given, the input is taken
    from the standard input (stdin). To end input from the standard
    input, press ^D.

* Function Options

    -s <max_word>
            To specify the maximum number of words. Default is 70000.

    -R      To create a dictionary for both forward and reverse
            conversion. (Default)

    -r      To create a reverse format dictionary, for reverse
            conversion only.

    -e      If the Hanzi inside the text dictionary contains
            characters such as space and tab, they are compacted to
            a special format. (Default)

    -S      To create a static dictionary.

    -U      To create a dynamic dictionary.

    -P <dic_passwd>
            To specify the password for the dictionary. If "-N" is
            used instead, the password of the dictionary is set to
            "*".

    -p <fre_passwd>
            To specify the password for the usage frequency file. If
            "-n" is used instead, the password of the frequency file
            is set to "*".

    -h <cixing_file>
            To specify the Cixing definition file.

* Note

    - The parts in [ ] are options. They may be omitted.
    - The Pinyin and Zhuyin dictionaries have the same format.
    - For details on the structure of a dictionary file, refer to
      Section 8.2.

* Reference

    cdtoa(6.8)

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ 6.8 CONVERSION OF BINARY DICTIONARY TO TEXT FORMAT - cdtoa ┃
┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛

* Description

    cdtoa - To convert the binary format of a dictionary back to text
            format.

* Default Path

    /usr/local/bin/cWnn4/cdtoa

* Command Format

    cdtoa [-n] [-s] [-z] [-e] [-E] <in_filename> [-h <cixing_file>]
          [<usage_frequency_file>....]

* Function

    To convert the binary format of the dictionary to text format and
    write it to the standard output (stdout). <in_filename> is the
    name of the input binary format dictionary.

    The output may be redirected into a file by using ">".
    For example,

    ┌──────────────────────────────┐
    │ cdtoa dict.dic > dict.u      │
    └──────────────────────────────┘

    Here "dict.u" is the output text format dictionary and "dict.dic"
    is the input binary format dictionary.

    [<usage_frequency_file>....] may name more than one usage
    frequency file (for a particular user). The usage frequency
    information in these files is reflected in the text format
    dictionary created.

* Function Options

    -s      To order the entries in the text dictionary according to
            Pinyin or Zhuyin.

    -n      To attach sequence numbers to the output.

    -z      To convert the binary format back to text format in
            Zhuyin. (Note: the default is Pinyin.)

    -e      If the Hanzi inside the text dictionary contains
            characters such as space and tab, they are compacted to
            a special format. (Default)

    -E      If the Hanzi inside the text dictionary contains
            characters such as space and tab, they are NOT compacted
            to a special format.

    -h <cixing_file>
            To specify the Cixing definition file.

* Note

    - The parts in [ ] are options. They may be omitted.
    - The Pinyin and Zhuyin dictionaries have the same format.
    - By default, the text dictionary produced by the conversion is
      in Pinyin.

* Reference

    catod(6.7)

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ 6.9 CREATION OF GRAMMAR FILE - catof ┃
┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛

* Description

    catof - To convert the text format of grammatical rules into
            binary format.

* Default Path

    /usr/local/bin/cWnn4/catof

* Command Format

    catof [-h <cixing_file>] <out_grammar_file>

* Function

    This command converts a text file of grammatical rules into
    binary format. <out_grammar_file> is the name of the output
    grammar file. If it is not specified, the grammatical rules are
    written to the standard output (stdout).

* Function Options

    -h <cixing_file>
            To specify the Cixing definition file.

* Note

    - The parts in [ ] are options. They may be omitted.
    - For details on the structure of a grammar file, refer to
      Section 8.4.

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ 6.10 REGISTRATION OF CHARACTERS/WORDS INTO A DICTIONARY - cwdreg ┃
┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛

* Description

    cwdreg - To register characters/words into the binary format
             dictionary.

* Default Path

    /usr/local/bin/cWnn4/cwdreg

* Command Format

    cwdreg [-D <server_name>] -n <env_name> -d <dic_no> < <text_dict>
        OR
    cwdreg [-D <server_name>] -n <env_name> -L <filename> < <text_dict>

* Function

    This command allows the user to register characters/words into
    the specified binary dictionary, identified either by dictionary
    number <dic_no> or by dictionary filename <filename>.

    <server_name> is the machine name of the server. If it is not
    specified, the default cserver indicated by the environment
    variable CSERVER is taken.

    "-n <env_name>" must be specified. <env_name> is the environment
    name. You may execute "cwnnstat -E" to see the current
    environment name.
    Either "-d <dic_no>" or "-L <filename>" must be specified.
    <dic_no> is the dictionary number. <filename> is the filename of
    the dictionary. "-L" is used when the dictionary is on the local
    machine.

    "<" redirects <text_dict> as the input to the "cwdreg" command.
    <text_dict> is the text file in which the user enters the
    characters/words to be registered. The format of this text file
    must be the same as that of the system text format dictionary.
    That is,

    ┌──────────────────────────────────────┐
    │ Pinyin    Word    Cixing   Frequency │
    │   :        :        :         :      │
    └──────────────────────────────────────┘

    Refer to Section 8.2 for details.

    By using "cwdreg", all the characters/words in <text_dict> are
    registered into the specified binary dictionary permanently.

* Note

    - The parts in [ ] are options. They may be omitted.

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ 6.11 DELETION OF CHARACTERS/WORDS FROM A DICTIONARY - cwddel ┃
┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛

* Description

    cwddel - To delete characters/words from the binary format
             dictionary.

* Default Path

    /usr/local/bin/cWnn4/cwddel

* Command Format

    cwddel [<server_name>] <env_name> <dic_no> < <text_dict>

* Function

    This command allows the user to delete characters/words from the
    specified binary dictionary, identified by the dictionary number
    <dic_no>.

    <server_name> is the machine name of the server. If it is not
    specified, the default cserver indicated by the environment
    variable CSERVER is taken.

    <env_name> must be specified. <env_name> is the environment name.
    You may execute "cwnnstat -E" to see the current environment
    name.

    <dic_no> must also be specified. <dic_no> is the dictionary
    number.

    "<" redirects <text_dict> as the input to the "cwddel" command.
    <text_dict> is the user text file which contains the serial
    numbers of the characters/words to be deleted from the binary
    dictionary.
    The serial number of a character/word can be found by using the
    environment operation "Word/character search 检索" as described
    in Section 5.2.

    By using "cwddel", all the characters/words whose serial numbers
    are given in <text_dict> are deleted from the specified binary
    dictionary permanently.

* Note

    - The parts in [ ] are options. They may be omitted.

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ 6.12 DICTIONARY SORT - cdicsort ┃
┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛

* Description

    cdicsort - To sort the text format dictionary.

* Default Path

    /usr/local/bin/cWnn4/cdicsort

* Command Format

    cdicsort < <text_dict>

* Function

    This command allows the user to sort the specified text format
    dictionary, which is redirected in as <text_dict>.

    If the input text dictionary is not given, the input is taken
    from the standard input (stdin). To end input from the standard
    input, press ^D.

    "<" redirects <text_dict> as the input to the "cdicsort" command.
    <text_dict> is the text format dictionary. The format of this
    text file must be the same as that of the system text format
    dictionary. That is,

    ┌──────────────────────────────────────┐
    │ Pinyin    Word    Cixing   Frequency │
    │   :        :        :         :      │
    └──────────────────────────────────────┘

    Refer to Section 8.2 for details.

    By using "cdicsort", all the Pinyin tuples in <text_dict> are
    sorted.

    The result of "cdicsort" is written to the standard output
    (stdout). It may also be redirected into a file by using ">".
    For example,

    ┌──────────────────────────────────┐
    │ cdicsort < dict.u > sort_dict.u  │
    └──────────────────────────────────┘

    Here "dict.u" is the input text format dictionary and
    "sort_dict.u" is the sorted text format dictionary.
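
The effect of sorting a text format dictionary can be sketched in
Python. This is only an illustration, not the cdicsort implementation:
it assumes that each entry is one line of whitespace-separated fields
with the Pinyin first, and it orders entries by plain string comparison
of that field. The real collation used by cdicsort and the exact field
syntax (see Section 8.2) may differ. The function name and the sample
entries are hypothetical.

```python
def sort_text_dictionary(lines):
    """Return the non-empty entries ordered by their first (Pinyin) field.

    Assumption: fields are separated by whitespace and the Pinyin comes
    first. Entries with the same Pinyin keep their input order, because
    Python's sort is stable.
    """
    entries = []
    for line in lines:
        stripped = line.strip()
        if not stripped:
            continue  # skip blank lines
        pinyin = stripped.split()[0]
        entries.append((pinyin, stripped))
    entries.sort(key=lambda entry: entry[0])
    return [text for _, text in entries]


if __name__ == "__main__":
    # Made-up entries in "Pinyin Word Cixing Frequency" order; the
    # Cixing value "X" is a placeholder, not a real cWnn Cixing name.
    sample = [
        "wo 我 X 2",
        "ai 爱 X 5",
        "ni 你 X 7",
    ]
    for entry in sort_text_dictionary(sample):
        print(entry)
```

Reading the entries from standard input and printing the result would
mirror the "cdicsort < dict.u > sort_dict.u" usage shown above.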