view lisp/nxml/nxml-rap.el @ 107521:54f3a4d055ee
Document font-use-system-font.
* cmdargs.texi (Font X): Move most content to Fonts.
* frames.texi (Fonts): New node. Document font-use-system-font.
* emacs.texi (Top):
* xresources.texi (Table of Resources):
* mule.texi (Defining Fontsets, Charsets): Update xrefs.
| author | Chong Yidong <cyd@stupidchicken.com> |
|---|---|
| date | Sat, 20 Mar 2010 13:24:06 -0400 |
| parents | 1d1d5d9bd884 |
| children | 376148b31b5e |
;;; nxml-rap.el --- low-level support for random access parsing for nXML mode

;; Copyright (C) 2003, 2004, 2007, 2008, 2009, 2010 Free Software Foundation, Inc.

;; Author: James Clark
;; Keywords: XML

;; This file is part of GNU Emacs.

;; GNU Emacs is free software: you can redistribute it and/or modify
;; it under the terms of the GNU General Public License as published by
;; the Free Software Foundation, either version 3 of the License, or
;; (at your option) any later version.

;; GNU Emacs is distributed in the hope that it will be useful,
;; but WITHOUT ANY WARRANTY; without even the implied warranty of
;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
;; GNU General Public License for more details.

;; You should have received a copy of the GNU General Public License
;; along with GNU Emacs. If not, see <http://www.gnu.org/licenses/>.

;;; Commentary:

;; This uses xmltok.el to do XML parsing. The fundamental problem is
;; how to handle changes. We don't want to maintain a complete parse
;; tree. We also don't want to reparse from the start of the document
;; on every keystroke. However, it is not possible in general to
;; parse an XML document correctly starting at a random point in the
;; middle. The main problems are comments, CDATA sections and
;; processing instructions: these can all contain things that are
;; indistinguishable from elements. Literals in the prolog are also a
;; problem. Attribute value literals are not a problem because
;; attribute value literals cannot contain less-than signs.
;;
;; Our strategy is to keep track of just the problematic things.
;; Specifically, we keep track of all comments, CDATA sections and
;; processing instructions in the instance. We do this by marking all
;; except the first character of these with a non-nil nxml-inside text
;; property. The value of the nxml-inside property is comment,
;; cdata-section or processing-instruction. The first character does
;; not have the nxml-inside property so we can find the beginning of
;; the construct by looking for a change in a text property value
;; (Emacs provides primitives for this). We use text properties
;; rather than overlays, since the implementation of overlays doesn't
;; look like it scales to large numbers of overlays in a buffer.
;;
;; We don't in fact track all these constructs, but only track them in
;; some initial part of the instance. The variable `nxml-scan-end'
;; contains the limit of where we have scanned up to for them.
;;
;; Thus to parse some random point in the file we first ensure that we
;; have scanned up to that point. Then we search backwards for a
;; <. Then we check whether the < has an nxml-inside property. If it
;; does we go backwards to first character that does not have an
;; nxml-inside property (this character must be a <). Then we start
;; parsing forward from the < we have found.
;;
;; The prolog has to be parsed specially, so we also keep track of the
;; end of the prolog in `nxml-prolog-end'. The prolog is reparsed on
;; every change to the prolog. This won't work well if people try to
;; edit huge internal subsets. Hopefully that will be rare.
;;
;; We keep track of the changes by adding to the buffer's
;; after-change-functions hook. Scanning is also done as a
;; prerequisite to fontification by adding to fontification-functions
;; (in the same way as jit-lock). This means that scanning for these
;; constructs had better be quick. Fortunately it is. Firstly, the
;; typical proportion of comments, CDATA sections and processing
;; instructions is small relative to other things. Secondly, to scan
;; we just search for the regexp <[!?].
;;
;; One problem is unclosed comments, processing instructions and CDATA
;; sections. Suppose, for example, we encounter a <!-- but there's no
;; matching -->. This is not an unexpected situation if the user is
;; creating a comment. It is not helpful to treat the whole of the
;; file starting from the <!-- onwards as a single unclosed comment
;; token. Instead we treat just the <!-- as a piece of not well-formed
;; markup and continue. The problem is that if at some later stage a
;; --> gets added to the buffer after the unclosed <!--, we will need
;; to reparse the buffer starting from the <!--. We need to keep
;; track of these reparse dependencies; they are called dependent
;; regions in the code.

;;; Code:

(require 'xmltok)
(require 'nxml-util)

(defvar nxml-prolog-end nil
  "Integer giving position following end of the prolog.")
(make-variable-buffer-local 'nxml-prolog-end)

(defvar nxml-scan-end nil
  "Marker giving position up to which we have scanned.
nxml-scan-end must be >= nxml-prolog-end. Furthermore, nxml-scan-end
must not be an inside position in the following sense. A position is
inside if the following character is a part of, but not the first
character of, a CDATA section, comment or processing instruction.
Furthermore all positions >= nxml-prolog-end and < nxml-scan-end that
are inside positions must have a non-nil `nxml-inside' property whose
value is a symbol specifying what it is inside. Any characters with a
non-nil `fontified' property must have position < nxml-scan-end and
the correct face. Dependent regions must also be established for any
unclosed constructs starting before nxml-scan-end.
There must be no `nxml-inside' properties after nxml-scan-end.")
(make-variable-buffer-local 'nxml-scan-end)

(defsubst nxml-get-inside (pos)
  (get-text-property pos 'nxml-inside))

(defsubst nxml-clear-inside (start end)
  (nxml-debug-clear-inside start end)
  (remove-text-properties start end '(nxml-inside nil)))

(defsubst nxml-set-inside (start end type)
  (nxml-debug-set-inside start end)
  (put-text-property start end 'nxml-inside type))

(defun nxml-inside-end (pos)
  "Return the end of the inside region containing POS.
Return nil if the character at POS is not inside."
  (if (nxml-get-inside pos)
      (or (next-single-property-change pos 'nxml-inside)
          (point-max))
    nil))

(defun nxml-inside-start (pos)
  "Return the start of the inside region containing POS.
Return nil if the character at POS is not inside."
  (if (nxml-get-inside pos)
      (or (previous-single-property-change (1+ pos) 'nxml-inside)
          (point-min))
    nil))

;;; Change management

(defun nxml-scan-after-change (start end)
  "Restore `nxml-scan-end' invariants after a change.
The change happened between START and END.
Return position after which lexical state is unchanged.
END must be > `nxml-prolog-end'. START must be outside
any 'inside' regions and at the beginning of a token."
  (if (>= start nxml-scan-end)
      nxml-scan-end
    (let ((inside-remove-start start)
          xmltok-errors
          xmltok-dependent-regions)
      (while (or (when (xmltok-forward-special (min end nxml-scan-end))
                   (when (memq xmltok-type
                               '(comment
                                 cdata-section
                                 processing-instruction))
                     (nxml-clear-inside inside-remove-start
                                        (1+ xmltok-start))
                     (nxml-set-inside (1+ xmltok-start)
                                      (point)
                                      xmltok-type)
                     (setq inside-remove-start (point)))
                   (if (< (point) (min end nxml-scan-end))
                       t
                     (setq end (point))
                     nil))
                 ;; The end of the change was inside but is now outside.
                 ;; Imagine something really weird like
                 ;; <![CDATA[foo <!-- bar ]]> <![CDATA[ stuff --> <!-- ]]> -->
                 ;; and suppose we deleted "<