comparison doc/misc/nxml-mode.texi @ 86378:5d15069189ff
Fixups for nxml per Romain Francoise email.
author | Mark A. Hershberger <mah@everybody.org> |
---|---|
date | Fri, 23 Nov 2007 19:38:49 +0000 |
parents | |
children | 2ac1a9b70580 |
86377:a6cfd62b2902 | 86378:5d15069189ff |
---|---|
1 \input texinfo @c -*- texinfo -*- | |
2 @c %**start of header | |
3 @setfilename ../../info/nxml-mode | |
4 @settitle nXML Mode | |
5 @c %**end of header | |
6 | |
7 @dircategory Emacs | |
8 @direntry | |
9 * nXML Mode: (nxml-mode.info). | |
10 @end direntry | |
11 | |
12 @node Top | |
13 @top nXML Mode | |
14 | |
15 This manual documents nxml-mode, an Emacs major mode for editing | |
16 XML with RELAX NG support. This manual is not yet complete. | |
17 | |
18 @menu | |
19 * Completion:: | |
20 * Inserting end-tags:: | |
21 * Paragraphs:: | |
22 * Outlining:: | |
23 * Locating a schema:: | |
24 * DTDs:: | |
25 * Limitations:: | |
26 @end menu | |
27 | |
28 @node Completion | |
29 @chapter Completion | |
30 | |
31 Apart from real-time validation, the most important feature that | |
32 nxml-mode provides for assisting in document creation is "completion". | |
33 Completion assists the user in inserting characters at point, based on | |
34 knowledge of the schema and on the contents of the buffer before | |
35 point. | |
36 | |
37 The traditional GNU Emacs key combination for completion in a | |
38 buffer is @kbd{M-@key{TAB}}. However, many window systems | |
39 and window managers use this key combination themselves (typically for | |
40 switching between windows) and do not pass it to applications. It's | |
41 hard to find key combinations in GNU Emacs that are both easy to type | |
42 and not taken by something else. @kbd{C-@key{RET}} (i.e. | |
43 pressing the Enter or Return key, while the Ctrl key is held down) is | |
44 available. It won't be available on a traditional terminal (because | |
45 it is indistinguishable from Return), but it will work with a window | |
46 system. Therefore we adopt the following solution by default: use | |
47 @kbd{C-@key{RET}} when there's a window system and | |
48 @kbd{M-@key{TAB}} when there's not. In the following, I | |
49 will assume that a window system is being used and will therefore | |
50 refer to @kbd{C-@key{RET}}. | |
51 | |
52 Completion works by examining the symbol preceding point. This | |
53 is the symbol to be completed. The symbol to be completed may be | |
54 empty. Completion considers what symbols starting with the symbol to | |
55 be completed would be valid replacements for the symbol to be | |
56 completed, given the schema and the contents of the buffer before | |
57 point. These symbols are the possible completions. An example may | |
58 make this clearer. Suppose the buffer looks like this (where @point{} | |
59 indicates point): | |
60 | |
61 @example | |
62 <html xmlns="http://www.w3.org/1999/xhtml"> | |
63 <h@point{} | |
64 @end example | |
65 | |
66 @noindent | |
67 and the schema is XHTML. In this context, the symbol to be completed | |
68 is @samp{h}. The possible completions consist of just | |
69 @samp{head}. Another example is | |
70 | |
71 @example | |
72 <html xmlns="http://www.w3.org/1999/xhtml"> | |
73 <head> | |
74 <@point{} | |
75 @end example | |
76 | |
77 @noindent | |
78 In this case, the symbol to be completed is empty, and the possible | |
79 completions are @samp{base}, @samp{isindex}, | |
80 @samp{link}, @samp{meta}, @samp{script}, | |
81 @samp{style}, @samp{title}. Another example is: | |
82 | |
83 @example | |
84 <html xmlns="@point{} | |
85 @end example | |
86 | |
87 @noindent | |
88 In this case, the symbol to be completed is empty, and the possible | |
89 completions are just @samp{http://www.w3.org/1999/xhtml}. | |
90 | |
91 When you type @kbd{C-@key{RET}}, what happens depends | |
92 on what the set of possible completions is. | |
93 | |
94 @itemize @bullet | |
95 @item | |
96 If the set of completions is empty, nothing | |
97 happens. | |
98 @item | |
99 If there is one possible completion, then that completion is | |
100 inserted, together with any following characters that are | |
101 required. For example, in this case: | |
102 | |
103 @example | |
104 <html xmlns="http://www.w3.org/1999/xhtml"> | |
105 <@point{} | |
106 @end example | |
107 | |
108 @noindent | |
109 @kbd{C-@key{RET}} will yield | |
110 | |
111 @example | |
112 <html xmlns="http://www.w3.org/1999/xhtml"> | |
113 <head@point{} | |
114 @end example | |
115 @item | |
116 If there is more than one possible completion, but all | |
117 possible completions share a common non-empty prefix, then that prefix | |
118 is inserted. For example, suppose the buffer is: | |
119 | |
120 @example | |
121 <html x@point{} | |
122 @end example | |
123 | |
124 @noindent | |
125 The symbol to be completed is @samp{x}. The possible completions | |
126 are @samp{xmlns} and @samp{xml:lang}. These share a | |
127 common prefix of @samp{xml}. Thus, @kbd{C-@key{RET}} | |
128 will yield: | |
129 | |
130 @example | |
131 <html xml@point{} | |
132 @end example | |
133 | |
134 @noindent | |
135 Typically, you would do @kbd{C-@key{RET}} again, which would | |
136 have the result described in the next item. | |
137 @item | |
138 If there is more than one possible completion, but the | |
139 possible completions do not share a non-empty prefix, then Emacs will | |
140 prompt you to input the symbol in the minibuffer, initializing the | |
141 minibuffer with the symbol to be completed, and popping up a buffer | |
142 showing the possible completions. You can now input the symbol to be | |
143 inserted. The symbol you input will be inserted in the buffer instead | |
144 of the symbol to be completed. Emacs will then insert any required | |
145 characters after the symbol. For example, if the buffer contains: | |
146 | |
147 @example | |
148 <html xml@point{} | |
149 @end example | |
150 | |
151 @noindent | |
152 Emacs will prompt you in the minibuffer with | |
153 | |
154 @example | |
155 Attribute: xml@point{} | |
156 @end example | |
157 | |
158 @noindent | |
159 and the buffer showing possible completions will contain | |
160 | |
161 @example | |
162 Possible completions are: | |
163 xml:lang xmlns | |
164 @end example | |
165 | |
166 @noindent | |
167 If you input @kbd{xmlns}, the result will be: | |
168 | |
169 @example | |
170 <html xmlns="@point{} | |
171 @end example | |
172 | |
173 @noindent | |
174 (If you do @kbd{C-@key{RET}} again, the namespace URI will | |
175 be inserted. Should that happen automatically?) | |
176 @end itemize | |
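
The four cases above can be sketched in a few lines of Python. This is only an illustration of the decision described in this chapter, not nxml-mode's actual implementation; the function name `completion_action` and the action labels are hypothetical. It also assumes that "share a common non-empty prefix" means a common prefix longer than the symbol already typed, which is what the `xml` example implies.

```python
import os


def completion_action(symbol, candidates):
    """Decide what a completion request should do (illustrative sketch).

    symbol     -- the text before point that is being completed (may be "")
    candidates -- names the schema would allow at this position
    """
    matches = sorted(c for c in candidates if c.startswith(symbol))
    if not matches:
        # No possible completions: nothing happens.
        return ("nothing", None)
    if len(matches) == 1:
        # Exactly one completion: insert it.
        return ("insert", matches[0])
    # os.path.commonprefix compares strings character by character.
    prefix = os.path.commonprefix(matches)
    if len(prefix) > len(symbol):
        # Several completions that agree on more text: insert that text.
        return ("prefix", prefix)
    # Otherwise the user has to choose, e.g. in the minibuffer.
    return ("prompt", matches)
```

With the attribute example from the text, `completion_action("x", ["xmlns", "xml:lang"])` selects the prefix case, and a second call with `"xml"` falls through to the prompt case.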
177 | |
178 @node Inserting end-tags | |
179 @chapter Inserting end-tags | |
180 | |
181 The main redundancy in XML syntax is end-tags. nxml-mode provides | |
182 several ways to make it easier to enter end-tags. You can use all of | |
183 these without a schema. | |
184 | |
185 You can use @kbd{C-@key{RET}} after @samp{</} | |
186 to complete the rest of the end-tag. | |
187 | |
188 @kbd{C-c C-f} inserts an end-tag for the element containing | |
189 point. This command is useful when you want to input the start-tag, | |
190 then input the content and finally input the end-tag. The @samp{f} | |
191 is mnemonic for finish. | |
192 | |
193 If you want to keep tags balanced and input the end-tag at the | |
194 same time as the start-tag, before inputting the content, then you can | |
195 use @kbd{C-c C-i}. This inserts a @samp{>}, then inserts | |
196 the end-tag and leaves point before the end-tag. @kbd{C-c C-b} | |
197 is similar but more convenient for block-level elements: it puts the | |
198 start-tag, point and the end-tag on successive lines, appropriately | |
199 indented. The @samp{i} is mnemonic for inline and the | |
200 @samp{b} is mnemonic for block. | |
201 | |
202 Finally, you can customize nxml-mode so that @kbd{/} | |
203 automatically inserts the rest of the end-tag when it occurs after | |
204 @samp{<}, by doing | |
205 | |
206 @display | |
207 @kbd{M-x customize-variable @key{RET} nxml-slash-auto-complete-flag @key{RET}} | |
208 @end display | |
209 | |
210 @noindent | |
211 and then following the instructions in the displayed buffer. | |
212 | |
213 @node Paragraphs | |
214 @chapter Paragraphs | |
215 | |
216 Emacs has several commands that operate on paragraphs, most | |
217 notably @kbd{M-q}. nXML mode redefines these to work in a way | |
218 that is useful for XML. The exact rules that are used to find the | |
219 beginning and end of a paragraph are complicated; they are designed | |
220 mainly to ensure that @kbd{M-q} does the right thing. | |
221 | |
222 A paragraph consists of one or more complete, consecutive lines. | |
223 A group of lines is not considered a paragraph unless it contains some | |
224 non-whitespace characters between tags or inside comments. A blank | |
225 line separates paragraphs. A single tag on a line by itself also | |
226 separates paragraphs. More precisely, if one tag together with any | |
227 leading and trailing whitespace completely occupy one or more lines, | |
228 then those lines will not be included in any paragraph. | |
229 | |
230 A start-tag at the beginning of the line (possibly indented) may | |
231 be treated as starting a paragraph. Similarly, an end-tag at the end | |
232 of the line may be treated as ending a paragraph. The following rules | |
233 are used to determine whether such a tag is in fact treated as a | |
234 paragraph boundary: | |
235 | |
236 @itemize @bullet | |
237 @item | |
238 If the schema does not allow text at that point, then it | |
239 is a paragraph boundary. | |
240 @item | |
241 If the end-tag corresponding to the start-tag is not at | |
242 the end of its line, or the start-tag corresponding to the end-tag is | |
243 not at the beginning of its line, then it is not a paragraph | |
244 boundary. For example, in | |
245 | |
246 @example | |
247 <p>This is a paragraph with an | |
248 <emph>emphasized</emph> phrase. | |
249 @end example | |
250 | |
251 @noindent | |
252 the @samp{<emph>} start-tag would not be considered as | |
253 starting a paragraph, because its corresponding end-tag is not at the | |
254 end of the line. | |
255 @item | |
256 If there is text that is a sibling in the element tree, then | |
257 it is not a paragraph boundary. For example, in | |
258 | |
259 @example | |
260 <p>This is a paragraph with an | |
261 <emph>emphasized phrase that takes one source line</emph> | |
262 @end example | |
263 | |
264 @noindent | |
265 the @samp{<emph>} start-tag would not be considered as | |
266 starting a paragraph, even though its end-tag is at the end of its | |
267 line, because the text @samp{This is a paragraph with an} | |
268 is a sibling of the @samp{emph} element. | |
269 @item | |
270 Otherwise, it is a paragraph boundary. | |
271 @end itemize | |
272 | |
273 @node Outlining | |
274 @chapter Outlining | |
275 | |
276 nXML mode allows you to display all or part of a buffer as an | |
277 outline, in a similar way to Emacs' outline mode. An outline in nXML | |
278 mode is based on recognizing two kinds of element: sections and | |
279 headings. There is one heading for every section and one section for | |
280 every heading. A section contains its heading as or within its first | |
281 child element. A section also contains its subordinate sections (its | |
282 subsections). The text content of a section consists of anything in a | |
283 section that is neither a subsection nor a heading. | |
284 | |
285 Note that this is a different model from that used by XHTML. | |
286 nXML mode's outline support will not be useful for XHTML unless you | |
287 adopt a convention of adding a @code{div} to enclose each | |
288 section, rather than having sections implicitly delimited by different | |
289 @code{h@var{n}} elements. This limitation may be removed | |
290 in a future version. | |
291 | |
292 The variable @code{nxml-section-element-name-regexp} gives | |
293 a regexp for the local names (i.e. the part of the name following any | |
294 prefix) of section elements. The variable | |
295 @code{nxml-heading-element-name-regexp} gives a regexp for the | |
296 local names of heading elements. For an element to be recognized | |
297 as a section | |
298 | |
299 @itemize @bullet | |
300 @item | |
301 its start-tag must occur at the beginning of a line | |
302 (possibly indented); | |
303 @item | |
304 its local name must match | |
305 @code{nxml-section-element-name-regexp}; | |
306 @item | |
307 either its first child element or a descendant of that | |
308 first child element must have a local name that matches | |
309 @code{nxml-heading-element-name-regexp}; the first such element | |
310 is treated as the section's heading. | |
311 @end itemize | |
312 | |
313 @noindent | |
314 You can customize these variables using @kbd{M-x | |
315 customize-variable}. | |
316 | |
317 There are three possible outline states for a section: | |
318 | |
319 @itemize @bullet | |
320 @item | |
321 normal, showing everything, including its heading, text | |
322 content and subsections; each subsection is displayed according to the | |
323 state of that subsection; | |
324 @item | |
325 showing just its heading, with both its text content and | |
326 its subsections hidden; all subsections are hidden regardless of their | |
327 state; | |
328 @item | |
329 showing its heading and its subsections, with its text | |
330 content hidden; each subsection is displayed according to the state of | |
331 that subsection. | |
332 @end itemize | |
333 | |
334 In the last two states, where the text content is hidden, the | |
335 heading is displayed specially, in an abbreviated form. An element | |
336 like this: | |
337 | |
338 @example | |
339 <section> | |
340 <title>Food</title> | |
341 <para>There are many kinds of food.</para> | |
342 </section> | |
343 @end example | |
344 | |
345 @noindent | |
346 would be displayed on a single line like this: | |
347 | |
348 @example | |
349 <-section>Food...</> | |
350 @end example | |
351 | |
352 @noindent | |
353 If there are hidden subsections, then a @code{+} will be used | |
354 instead of a @code{-} like this: | |
355 | |
356 @example | |
357 <+section>Food...</> | |
358 @end example | |
359 | |
360 @noindent | |
361 If there are non-hidden subsections, then the section will instead be | |
362 displayed like this: | |
363 | |
364 @example | |
365 <-section>Food... | |
366 <-section>Delicious Food...</> | |
367 <-section>Distasteful Food...</> | |
368 </-section> | |
369 @end example | |
370 | |
371 @noindent | |
372 The heading is always displayed with an indent that corresponds to its | |
373 depth in the outline, even if it is not actually indented in the buffer. | |
374 The variable @code{nxml-outline-child-indent} controls how much | |
375 a subheading is indented with respect to its parent heading when the | |
376 heading is being displayed specially. | |
377 | |
378 Commands to change the outline state of sections are bound to | |
379 key sequences that start with @kbd{C-c C-o} (@kbd{o} is | |
380 mnemonic for outline). The third and final key has been chosen to be | |
381 consistent with outline mode. In the following descriptions | |
382 current section means the section containing point, or, more precisely, | |
383 the innermost section containing the character immediately following | |
384 point. | |
385 | |
386 @itemize @bullet | |
387 @item | |
388 @kbd{C-c C-o C-a} shows all sections in the buffer | |
389 normally. | |
390 @item | |
391 @kbd{C-c C-o C-t} hides the text content | |
392 of all sections in the buffer. | |
393 @item | |
394 @kbd{C-c C-o C-c} hides the text content | |
395 of the current section. | |
396 @item | |
397 @kbd{C-c C-o C-e} shows the text content | |
398 of the current section. | |
399 @item | |
400 @kbd{C-c C-o C-d} hides the text content | |
401 and subsections of the current section. | |
402 @item | |
403 @kbd{C-c C-o C-s} shows the current section | |
404 and all its direct and indirect subsections normally. | |
405 @item | |
406 @kbd{C-c C-o C-k} shows the headings of the | |
407 direct and indirect subsections of the current section. | |
408 @item | |
409 @kbd{C-c C-o C-l} hides the text content of the | |
410 current section and of its direct and indirect | |
411 subsections. | |
412 @item | |
413 @kbd{C-c C-o C-i} shows the headings of the | |
414 direct subsections of the current section. | |
415 @item | |
416 @kbd{C-c C-o C-o} hides as much as possible without | |
417 hiding the current section's text content; the headings of ancestor | |
418 sections of the current section and their child sections will | |
419 not be hidden. | |
420 @end itemize | |
421 | |
422 When a heading is displayed specially, you can use | |
423 @key{RET} in that heading to show the text content of the section | |
424 in the same way as @kbd{C-c C-o C-e}. | |
425 | |
426 You can also use the mouse to change the outline state: | |
427 @kbd{S-mouse-2} hides the text content of a section in the same | |
428 way as @kbd{C-c C-o C-c}; @kbd{mouse-2} on a specially | |
429 displayed heading shows the text content of the section in the same | |
430 way as @kbd{C-c C-o C-e}; @kbd{mouse-1} on a specially | |
431 displayed start-tag toggles the display of subheadings on and | |
432 off. | |
433 | |
434 The outline state for each section is stored with the first | |
435 character of the section (as a text property). Every command that | |
436 changes the outline state of any section updates the display of the | |
437 buffer so that each section is displayed correctly according to its | |
438 outline state. If the section structure is subsequently changed, then | |
439 it is possible for the display to no longer correctly reflect the | |
440 stored outline state. @kbd{C-c C-o C-r} can be used to refresh | |
441 the display so it is correct again. | |
442 | |
443 @node Locating a schema | |
444 @chapter Locating a schema | |
445 | |
446 nXML mode has a configurable set of rules to locate a schema for | |
447 the file being edited. The rules are contained in one or more schema | |
448 locating files, which are XML documents. | |
449 | |
450 The variable @samp{rng-schema-locating-files} specifies | |
451 the list of the file-names of schema locating files that nXML mode | |
452 should use. The order of the list is significant: when file | |
453 @var{x} occurs in the list before file @var{y} then rules | |
454 from file @var{x} have precedence over rules from file | |
455 @var{y}. A filename specified in | |
456 @samp{rng-schema-locating-files} may be relative. If so, it will | |
457 be resolved relative to the document for which a schema is being | |
458 located. It is not an error if relative file-names in | |
459 @samp{rng-schema-locating-files} do not exist. You can use | |
460 @kbd{M-x customize-variable @key{RET} rng-schema-locating-files | |
461 @key{RET}} to customize the list of schema locating | |
462 files. | |
463 | |
464 By default, the @samp{rng-schema-locating-files} list has two | |
465 members: @samp{schemas.xml}, and | |
466 @samp{@var{dist-dir}/schema/schemas.xml} where | |
467 @samp{@var{dist-dir}} is the directory containing the nXML | |
468 distribution. The first member will cause nXML mode to use a file | |
469 @samp{schemas.xml} in the same directory as the document being | |
470 edited if such a file exists. The second member contains rules for the | |
471 schemas that are included with the nXML distribution. | |
472 | |
473 @menu | |
474 * Commands for locating a schema:: | |
475 * Schema locating files:: | |
476 @end menu | |
477 | |
478 @node Commands for locating a schema | |
479 @section Commands for locating a schema | |
480 | |
481 The command @kbd{C-c C-s C-w} will tell you what schema | |
482 is currently being used. | |
483 | |
484 The rules for locating a schema are applied automatically when | |
485 you visit a file in nXML mode. However, if you have just created a new | |
486 file and the schema cannot be inferred from the file-name, then this | |
487 will not locate the right schema. In this case, you should insert the | |
488 start-tag of the root element and then use the command @kbd{C-c | |
489 C-a}, which reapplies the rules based on the current content of | |
490 the document. It is usually not necessary to insert the complete | |
491 start-tag; often just @samp{<@var{name}} is | |
492 enough. | |
493 | |
494 If you want to use a schema that has not yet been added to the | |
495 schema locating files, you can use the command @kbd{C-c C-s C-f} | |
496 to manually select the file containing the schema for the document in | |
497 the current buffer. Emacs will read the file-name of the schema from the | |
498 minibuffer. After reading the file-name, Emacs will ask whether you | |
499 wish to add a rule to a schema locating file that persistently | |
500 associates the document with the selected schema. The rule will be | |
501 added to the first file in the list specified by | |
502 @samp{rng-schema-locating-files}; it will create the file if | |
503 necessary, but will not create a directory. If the variable | |
504 @samp{rng-schema-locating-files} has not been customized, this | |
505 means that the rule will be added to the file @samp{schemas.xml} | |
506 in the same directory as the document being edited. | |
507 | |
508 The command @kbd{C-c C-s C-t} allows you to select a schema by | |
509 specifying an identifier for the type of the document. The schema | |
510 locating files determine the available type identifiers and what | |
511 schema is used for each type identifier. This is useful when it is | |
512 impossible to infer the right schema from either the file-name or the | |
513 content of the document, even though the schema is already in the | |
514 schema locating file. A situation in which this can occur is when | |
515 there are multiple variants of a schema where all valid documents have | |
516 the same document element. For example, XHTML has Strict and | |
517 Transitional variants. In a situation like this, a schema locating file | |
518 can define a type identifier for each variant. As with @kbd{C-c | |
519 C-s C-f}, Emacs will ask whether you wish to add a rule to a schema | |
520 locating file that persistently associates the document with the | |
521 specified type identifier. | |
522 | |
523 The command @kbd{C-c C-s C-l} adds a rule to a schema | |
524 locating file that persistently associates the document with | |
525 the schema that is currently being used. | |
526 | |
527 @node Schema locating files | |
528 @section Schema locating files | |
529 | |
530 Each schema locating file specifies a list of rules. The rules | |
531 from each file are appended in order. To locate a schema each rule is | |
532 applied in turn until a rule matches. The first matching rule is then | |
533 used to determine the schema. | |
534 | |
535 Schema locating files are designed to be useful for other | |
536 applications that need to locate a schema for a document. In fact, | |
537 there is nothing specific to locating schemas in the design; it could | |
538 equally well be used for locating a stylesheet. | |
539 | |
540 @menu | |
541 * Schema locating file syntax basics:: | |
542 * Using the document's URI to locate a schema:: | |
543 * Using the document element to locate a schema:: | |
544 * Using type identifiers in schema locating files:: | |
545 * Using multiple schema locating files:: | |
546 @end menu | |
547 | |
548 @node Schema locating file syntax basics | |
549 @subsection Schema locating file syntax basics | |
550 | |
551 There is a schema for schema locating files in the file | |
552 @samp{locate.rnc} in the schema directory. Schema locating | |
553 files must be valid with respect to this schema. | |
554 | |
555 The document element of a schema locating file must be | |
556 @samp{locatingRules} and the namespace URI must be | |
557 @samp{http://thaiopensource.com/ns/locating-rules/1.0}. The | |
558 children of the document element specify rules. The order of the | |
559 children is the same as the order of the rules. Here's a complete | |
560 example of a schema locating file: | |
561 | |
562 @example | |
563 <?xml version="1.0"?> | |
564 <locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0"> | |
565 <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml.rnc"/> | |
566 <documentElement localName="book" uri="docbook.rnc"/> | |
567 </locatingRules> | |
568 @end example | |
569 | |
570 @noindent | |
571 This says to use the schema @samp{xhtml.rnc} for a document with | |
572 namespace @samp{http://www.w3.org/1999/xhtml}, and to use the | |
573 schema @samp{docbook.rnc} for a document whose document element has the local name | |
574 @samp{book}. If the document element had both a namespace URI | |
575 of @samp{http://www.w3.org/1999/xhtml} and a local name of | |
576 @samp{book}, then the matching rule that comes first will be | |
577 used and so the schema @samp{xhtml.rnc} would be used. There is | |
578 no precedence between different types of rule; the first matching rule | |
579 of any type is used. | |
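
The "first matching rule wins" behavior can be sketched as follows. This is a simplified model under stated assumptions, not the code nXML mode uses: rules are represented as hypothetical `(kind, value, schema_uri)` tuples, and only the two rule kinds from the example file are handled.

```python
def locate_schema(rules, namespace, local_name):
    """Return the schema URI of the first rule that matches, else None.

    rules      -- list of (kind, value, schema_uri) tuples in file order
    namespace  -- namespace URI of the document element (or None)
    local_name -- local name of the document element
    """
    for kind, value, schema_uri in rules:
        if kind == "namespace" and value == namespace:
            return schema_uri
        if kind == "documentElement" and value == local_name:
            return schema_uri
    # No rule matched.
    return None


# The two rules from the example schema locating file, in order.
EXAMPLE_RULES = [
    ("namespace", "http://www.w3.org/1999/xhtml", "xhtml.rnc"),
    ("documentElement", "book", "docbook.rnc"),
]
```

For a document element that has both the XHTML namespace and the local name `book`, the sketch returns `xhtml.rnc` because the namespace rule comes first; reversing the list would give `docbook.rnc`.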
580 | |
581 As usual with XML-related technologies, resources are identified | |
582 by URIs. The @samp{uri} attribute identifies the schema by | |
583 specifying the URI. The URI may be relative. If so, it is resolved | |
584 relative to the URI of the schema locating file that contains the | |
585 attribute. This means that if the value of the @samp{uri} attribute | |
586 does not contain a @samp{/}, then it will refer to a filename in | |
587 the same directory as the schema locating file. | |
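
Relative resolution of this kind is ordinary URI reference resolution. As a small illustration, Python's standard `urllib.parse.urljoin` produces the expected result; the location `file:///home/jjc/docs/schemas.xml` for the schema locating file is an assumption made up for this example, and the helper name is hypothetical.

```python
from urllib.parse import urljoin


def resolve_schema_uri(locating_file_uri, uri_attribute):
    """Resolve a rule's uri attribute against the schema locating file's URI."""
    return urljoin(locating_file_uri, uri_attribute)


# Hypothetical location of a schema locating file.
LOCATING_FILE = "file:///home/jjc/docs/schemas.xml"

# A value without "/" names a file in the same directory.
same_dir = resolve_schema_uri(LOCATING_FILE, "xhtml.rnc")
# A value with "../" climbs out of that directory.
parent_dir = resolve_schema_uri(LOCATING_FILE, "../schema/xhtml.rnc")
```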
588 | |
589 @node Using the document's URI to locate a schema | |
590 @subsection Using the document's URI to locate a schema | |
591 | |
592 A @samp{uri} rule locates a schema based on the URI of the | |
593 document. The @samp{uri} attribute specifies the URI of the | |
594 schema. The @samp{resource} attribute can be used to specify | |
595 the schema for a particular document. For example, | |
596 | |
597 @example | |
598 <uri resource="spec.xml" uri="docbook.rnc"/> | |
599 @end example | |
600 | |
601 @noindent | |
602 specifies that the schema for @samp{spec.xml} is | |
603 @samp{docbook.rnc}. | |
604 | |
605 The @samp{pattern} attribute can be used instead of the | |
606 @samp{resource} attribute to specify the schema for any document | |
607 whose URI matches a pattern. The pattern has the same syntax as an | |
608 absolute or relative URI except that the path component of the URI can | |
609 use a @samp{*} character to stand for zero or more characters | |
610 within a path segment (i.e. any character other than @samp{/}). | |
611 Typically, the URI pattern looks like a relative URI, but, whereas a | |
612 relative URI in the @samp{resource} attribute is resolved into a | |
613 particular absolute URI using the base URI of the schema locating | |
614 file, a relative URI pattern matches if it matches some number of | |
615 complete path segments of the document's URI ending with the last path | |
616 segment of the document's URI. For example, | |
617 | |
618 @example | |
619 <uri pattern="*.xsl" uri="xslt.rnc"/> | |
620 @end example | |
621 | |
622 @noindent | |
623 specifies that the schema for documents with a URI whose path ends | |
624 with @samp{.xsl} is @samp{xslt.rnc}. | |
625 | |
626 A @samp{transformURI} rule locates a schema by | |
627 transforming the URI of the document. The @samp{fromPattern} | |
628 attribute specifies a URI pattern with the same meaning as the | |
629 @samp{pattern} attribute of the @samp{uri} element. The | |
630 @samp{toPattern} attribute is a URI pattern that is used to | |
631 generate the URI of the schema. Each @samp{*} in the | |
632 @samp{toPattern} is replaced by the string that matched the | |
633 corresponding @samp{*} in the @samp{fromPattern}. The | |
634 resulting string is appended to the initial part of the document's URI | |
635 that was not explicitly matched by the @samp{fromPattern}. The | |
636 rule matches only if the transformed URI identifies an existing | |
637 resource. For example, the rule | |
638 | |
639 @example | |
640 <transformURI fromPattern="*.xml" toPattern="*.rnc"/> | |
641 @end example | |
642 | |
643 @noindent | |
644 would transform the URI @samp{file:///home/jjc/docs/spec.xml} | |
645 into the URI @samp{file:///home/jjc/docs/spec.rnc}. Thus, this | |
646 rule specifies that to locate a schema for a document | |
647 @samp{@var{foo}.xml}, Emacs should test whether a file | |
648 @samp{@var{foo}.rnc} exists in the same directory as | |
649 @samp{@var{foo}.xml}, and, if so, should use it as the | |
650 schema. | |
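
The pattern matching and the URI transformation described above can be approximated with regular expressions. The sketch below is a rough model, not nXML mode's implementation: it handles only relative patterns, treats the URI as a plain string instead of isolating its path component, and skips the check that the transformed URI identifies an existing resource. The function names are hypothetical.

```python
import re


def _pattern_regex(pattern):
    """Compile a relative URI pattern into a regex.

    Each "*" matches zero or more characters other than "/", and the
    match must cover complete trailing path segments of the URI.
    """
    parts = (re.escape(p) for p in pattern.split("*"))
    return re.compile("(?:^|(?<=/))" + "([^/]*)".join(parts) + "$")


def uri_matches(uri, pattern):
    """True if the document URI matches the relative pattern."""
    return _pattern_regex(pattern).search(uri) is not None


def transform_uri(uri, from_pattern, to_pattern):
    """Apply a transformURI-style rewrite; return None if there is no match."""
    m = _pattern_regex(from_pattern).search(uri)
    if m is None:
        return None
    groups = iter(m.groups())
    pieces = to_pattern.split("*")
    out = pieces[0]
    for piece in pieces[1:]:
        # Each "*" in the target takes the text matched by the
        # corresponding "*" in the source pattern.
        out += next(groups, "") + piece
    # Keep the part of the URI that the pattern did not explicitly match.
    return uri[:m.start()] + out
```

Applied to the example from the text, `transform_uri("file:///home/jjc/docs/spec.xml", "*.xml", "*.rnc")` yields `file:///home/jjc/docs/spec.rnc`.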
651 | |
652 @node Using the document element to locate a schema | |
653 @subsection Using the document element to locate a schema | |
654 | |
655 A @samp{documentElement} rule locates a schema based on | |
656 the local name and prefix of the document element. For example, a rule | |
657 | |
658 @example | |
659 <documentElement prefix="xsl" localName="stylesheet" uri="xslt.rnc"/> | |
660 @end example | |
661 | |
662 @noindent | |
663 specifies that when the name of the document element is | |
664 @samp{xsl:stylesheet}, then @samp{xslt.rnc} should be used | |
665 as the schema. Either the @samp{prefix} or | |
666 @samp{localName} attribute may be omitted to allow any prefix or | |
667 local name. | |
668 | |
669 A @samp{namespace} rule locates a schema based on the | |
670 namespace URI of the document element. For example, a rule | |
671 | |
672 @example | |
673 <namespace ns="http://www.w3.org/1999/XSL/Transform" uri="xslt.rnc"/> | |
674 @end example | |
675 | |
676 @noindent | |
677 specifies that when the namespace URI of the document element is | |
678 @samp{http://www.w3.org/1999/XSL/Transform}, then | |
679 @samp{xslt.rnc} should be used as the schema. | |
680 | |
681 @node Using type identifiers in schema locating files | |
682 @subsection Using type identifiers in schema locating files | |
683 | |
684 Type identifiers allow a level of indirection in locating the | |
685 schema for a document. Instead of associating the document directly | |
686 with a schema URI, the document is associated with a type identifier, | |
687 which is in turn associated with a schema URI. nXML mode does not | |
688 constrain the format of type identifiers. They can be simply strings | |
689 without any formal structure or they can be public identifiers or | |
690 URIs. Note that these type identifiers have nothing to do with the | |
691 DOCTYPE declaration. When comparing type identifiers, whitespace is | |
692 normalized in the same way as with the @samp{xsd:token} | |
693 datatype: leading and trailing whitespace is stripped; other sequences | |
694 of whitespace are normalized to a single space character. | |
695 | |
696 Each of the rules described in previous sections that uses a | |
697 @samp{uri} attribute to specify a schema, can instead use a | |
698 @samp{typeId} attribute to specify a type identifier. The type | |
699 identifier can be associated with a URI using a @samp{typeId} | |
700 element. For example, | |
701 | |
702 @example | |
703 <locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0"> | |
704 <namespace ns="http://www.w3.org/1999/xhtml" typeId="XHTML"/> | |
705 <typeId id="XHTML" typeId="XHTML Strict"/> | |
706 <typeId id="XHTML Strict" uri="xhtml-strict.rnc"/> | |
707 <typeId id="XHTML Transitional" uri="xhtml-transitional.rnc"/> | |
708 </locatingRules> | |
709 @end example | |
710 | |
711 @noindent | |
712 declares three type identifiers @samp{XHTML} (representing the | |
713 default variant of XHTML to be used), @samp{XHTML Strict} and | |
714 @samp{XHTML Transitional}. Such a schema locating file would | |
715 use @samp{xhtml-strict.rnc} for a document whose namespace is | |
716 @samp{http://www.w3.org/1999/xhtml}. But it is considerably | |
717 more flexible than a schema locating file that simply specified | |
718 | |
719 @example | |
720 <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml-strict.rnc"/> | |
721 @end example | |
722 | |
723 @noindent | |
724 With type identifiers, a user can use @kbd{C-c C-s C-t} to select between | |
725 XHTML Strict and XHTML Transitional. A user can also add a schema locating file | |
726 | |
727 @example | |
728 <locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0"> | |
729 <typeId id="XHTML" typeId="XHTML Transitional"/> | |
730 </locatingRules> | |
731 @end example | |
732 | |
733 @noindent | |
734 that makes the default variant of XHTML be XHTML Transitional. | |
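| |
To see why this override works, suppose (as an assumption of this | |
sketch) that the user's file comes before the original file in the | |
list of schema locating files. The rules of both files then behave | |
roughly like the following single file, in which the first | |
@samp{typeId} rule for @samp{XHTML} is the one that is found: | |
| |
@example | |
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0"> | |
<typeId id="XHTML" typeId="XHTML Transitional"/> | |
<namespace ns="http://www.w3.org/1999/xhtml" typeId="XHTML"/> | |
<typeId id="XHTML" typeId="XHTML Strict"/> | |
<typeId id="XHTML Strict" uri="xhtml-strict.rnc"/> | |
<typeId id="XHTML Transitional" uri="xhtml-transitional.rnc"/> | |
</locatingRules> | |
@end example | |
| |
@noindent | |
An XHTML document is thus resolved from @samp{XHTML} to | |
@samp{XHTML Transitional} and finally to | |
@samp{xhtml-transitional.rnc}. | |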
735 | |
736 @node Using multiple schema locating files | |
737 @subsection Using multiple schema locating files | |
738 | |
739 The @samp{include} element includes rules from another | |
740 schema locating file. The behavior is exactly as if the rules from | |
741 that file were included in place of the @samp{include} element. | |
742 Relative URIs are resolved into absolute URIs before the inclusion is | |
743 performed. For example, | |
744 | |
745 @example | |
746 <include rules="../rules.xml"/> | |
747 @end example | |
748 | |
749 @noindent | |
750 includes the rules from @samp{rules.xml}. | |
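| |
As a further sketch, a personal schema locating file might add one | |
rule of its own and then pull in a shared file. The DocBook rule and | |
the file name @samp{docbook.rnc} are assumptions chosen for | |
illustration. | |
| |
@example | |
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0"> | |
<namespace ns="http://docbook.org/ns/docbook" uri="docbook.rnc"/> | |
<include rules="../rules.xml"/> | |
</locatingRules> | |
@end example | |
| |
@noindent | |
Because relative URIs are resolved before the inclusion is performed, | |
a relative @samp{uri} attribute inside @samp{rules.xml} is interpreted | |
relative to the location of @samp{rules.xml}, not relative to the | |
including file. | |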
751 | |
752 The process of locating a schema takes as input a list of schema | |
753 locating files. The rules in all these files and in the files they | |
754 include are resolved into a single list of rules, which are applied | |
755 strictly in order. Sometimes this order is not what is needed. | |
756 For example, suppose you have two schema locating files, a private | |
757 file | |
758 | |
759 @example | |
760 <locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0"> | |
761 <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml.rnc"/> | |
762 </locatingRules> | |
763 @end example | |
764 | |
765 @noindent | |
766 followed by a public file | |
767 | |
768 @example | |
769 <locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0"> | |
770 <transformURI fromPattern="*.xml" toPattern="*.rnc"/> | |
771 <namespace ns="http://www.w3.org/1999/XSL/Transform" typeId="XSLT"/> | |
772 </locatingRules> | |
773 @end example | |
774 | |
775 @noindent | |
776 The effect of these two files is that the XHTML @samp{namespace} | |
777 rule takes precedence over the @samp{transformURI} rule, which | |
778 is almost certainly not what is needed. This can be solved by adding | |
779 an @samp{applyFollowingRules} element to the private file. | |
780 | |
781 @example | |
782 <locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0"> | |
783 <applyFollowingRules ruleType="transformURI"/> | |
784 <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml.rnc"/> | |
785 </locatingRules> | |
786 @end example | |
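| |
Conceptually, and only as a sketch for these two particular files, the | |
combined rules are then tried in roughly the order shown below. The | |
comments stand in for rules taken from the public file. The exact | |
lookup performed by nXML mode is not spelled out here. | |
| |
@example | |
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0"> | |
<!-- first: the transformURI rule from the public file that follows --> | |
<namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml.rnc"/> | |
<!-- then: the remaining rules of the public file, in order --> | |
</locatingRules> | |
@end example | |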
787 | |
788 @node DTDs | |
789 @chapter DTDs | |
790 | |
791 nxml-mode is designed to support the creation of standalone XML | |
792 documents that do not depend on a DTD. Although it is common practice | |
793 to insert a DOCTYPE declaration referencing an external DTD, this has | |
794 undesirable side-effects. It means that the document is no longer | |
795 self-contained. It also means that different XML parsers may interpret | |
796 the document in different ways, since the XML Recommendation does not | |
797 require XML parsers to read the DTD. With DTDs, it was impractical to | |
798 get validation without using an external DTD or reference to a | |
799 parameter entity. With RELAX NG and other schema languages, you can | |
800 simultaneously get the benefits of validation and standalone XML | |
801 documents. Therefore, I recommend that you do not reference an | |
802 external DTD in your XML documents. | |
803 | |
804 One problem with omitting the DTD is entities for characters. Typically, as well as | |
805 providing validation, DTDs also provide a set of character entities | |
806 for documents to use. Schemas cannot provide this functionality, | |
807 because schema validation happens after XML parsing. The recommended | |
808 solution is to either use the Unicode characters directly, or, if this | |
809 is impractical, use character references. nXML mode supports this by | |
810 providing commands for entering characters and character references | |
811 using the Unicode names, and can display the glyph corresponding to a | |
812 character reference. | |
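| |
For instance, a document that would otherwise rely on DTD-defined | |
entities such as @samp{&nbsp;} or @samp{&eacute;} can use numeric | |
character references instead. The element names below are assumptions | |
for illustration. The code points are those of NO-BREAK SPACE (U+00A0) | |
and LATIN SMALL LETTER E WITH ACUTE (U+00E9). | |
| |
@example | |
<doc> | |
<p>caf&#xE9; au lait: 250&#xA0;ml</p> | |
</doc> | |
@end example | |
| |
@noindent | |
No DOCTYPE declaration is needed for such a document to be parsed. | |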
813 | |
814 @node Limitations | |
815 @chapter Limitations | |
816 | |
817 nXML mode has some limitations: | |
818 | |
819 @itemize @bullet | |
820 @item | |
821 DTD support is limited. Internal parsed general entities declared | |
822 in the internal subset are supported provided they do not contain | |
823 elements. Other usage of DTDs is ignored. | |
824 @item | |
825 The restrictions on RELAX NG schemas in section 7 of the RELAX NG | |
826 specification are not enforced. | |
827 @item | |
828 Unicode support has problems. This stems mostly from the fact that | |
829 the XML (and RELAX NG) character model is based squarely on Unicode, | |
830 whereas the Emacs character model is not. Emacs 23 is slated to have | |
831 full Unicode support, which should improve the situation here. | |
832 @end itemize | |
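| |
As a sketch of the first limitation, the following document declares | |
an internal parsed general entity in the internal subset. Its | |
replacement text contains no elements, so it falls within the | |
supported case. The entity name @samp{product} and the element names | |
are assumptions for illustration. | |
| |
@example | |
<!DOCTYPE doc [ | |
<!ENTITY product "nXML mode"> | |
]> | |
<doc> | |
<p>This page describes &product;.</p> | |
</doc> | |
@end example | |
| |
@noindent | |
An entity whose replacement text contained markup, such as | |
@samp{<b>nXML</b> mode}, would fall outside what is supported. | |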
833 | |
834 @bye |