comparison lispref/files.texi @ 80890:6b44d05a5f0b

* elisp.texi (Top): Remove "Saving Properties" from detailed menu. * files.texi (Format Conversion): Expand intro; add menu. (Format Conversion Overview, Format Conversion Round-Trip) (Format Conversion Piecemeal): New nodes/subsections. * hooks.texi: Xref "Format Conversion", not "Saving Properties". * text.texi (Text Properties): Remove "Saving Properties" from menu. (Saving Properties): Delete node/subsection.
author Thien-Thi Nguyen <ttn@gnuvola.org>
date Thu, 10 May 2007 08:43:12 +0000
parents 916f8aa2138d
children 776cb0a1bb24
80889:8938fa90afdb 80890:6b44d05a5f0b
372 @var{filename}. If the buffer is not visiting a file, it uses the 372 @var{filename}. If the buffer is not visiting a file, it uses the
373 buffer name instead. 373 buffer name instead.
374 @end deffn 374 @end deffn
375 375
376 Saving a buffer runs several hooks. It also performs format 376 Saving a buffer runs several hooks. It also performs format
377 conversion (@pxref{Format Conversion}), and may save text properties in 377 conversion (@pxref{Format Conversion}).
378 ``annotations'' (@pxref{Saving Properties}).
379 378
380 @defvar write-file-functions 379 @defvar write-file-functions
381 The value of this variable is a list of functions to be called before 380 The value of this variable is a list of functions to be called before
382 writing out a buffer to its visited file. If one of them returns 381 writing out a buffer to its visited file. If one of them returns
383 non-@code{nil}, the file is considered already written and the rest of 382 non-@code{nil}, the file is considered already written and the rest of
494 and the length of the data inserted. An error is signaled if 493 and the length of the data inserted. An error is signaled if
495 @var{filename} is not the name of a file that can be read. 494 @var{filename} is not the name of a file that can be read.
496 495
497 The function @code{insert-file-contents} checks the file contents 496 The function @code{insert-file-contents} checks the file contents
498 against the defined file formats, and converts the file contents if 497 against the defined file formats, and converts the file contents if
499 appropriate. @xref{Format Conversion}. It also calls the functions in 498 appropriate and also calls the functions in
500 the list @code{after-insert-file-functions}; see @ref{Saving 499 the list @code{after-insert-file-functions}. @xref{Format Conversion}.
501 Properties}. Normally, one of the functions in the 500 Normally, one of the functions in the
502 @code{after-insert-file-functions} list determines the coding system 501 @code{after-insert-file-functions} list determines the coding system
503 (@pxref{Coding Systems}) used for decoding the file's contents, 502 (@pxref{Coding Systems}) used for decoding the file's contents,
504 including end-of-line conversion. 503 including end-of-line conversion.
505 504
506 If @var{visit} is non-@code{nil}, this function additionally marks the 505 If @var{visit} is non-@code{nil}, this function additionally marks the
618 The optional argument @var{lockname}, if non-@code{nil}, specifies the 617 The optional argument @var{lockname}, if non-@code{nil}, specifies the
619 file name to use for purposes of locking and unlocking, overriding 618 file name to use for purposes of locking and unlocking, overriding
620 @var{filename} and @var{visit} for that purpose. 619 @var{filename} and @var{visit} for that purpose.
621 620
622 The function @code{write-region} converts the data which it writes to 621 The function @code{write-region} converts the data which it writes to
623 the appropriate file formats specified by @code{buffer-file-format}. 622 the appropriate file formats specified by @code{buffer-file-format}
624 @xref{Format Conversion}. It also calls the functions in the list 623 and also calls the functions in the list
625 @code{write-region-annotate-functions}; see @ref{Saving Properties}. 624 @code{write-region-annotate-functions}.
625 @xref{Format Conversion}.
626 626
627 Normally, @code{write-region} displays the message @samp{Wrote 627 Normally, @code{write-region} displays the message @samp{Wrote
628 @var{filename}} in the echo area. If @var{visit} is neither @code{t} 628 @var{filename}} in the echo area. If @var{visit} is neither @code{t}
629 nor @code{nil} nor a string, then this message is inhibited. This 629 nor @code{nil} nor a string, then this message is inhibited. This
630 feature is useful for programs that use files for internal purposes, 630 feature is useful for programs that use files for internal purposes,
2800 @section File Format Conversion 2800 @section File Format Conversion
2801 2801
2802 @cindex file format conversion 2802 @cindex file format conversion
2803 @cindex encoding file formats 2803 @cindex encoding file formats
2804 @cindex decoding file formats 2804 @cindex decoding file formats
2805 The variable @code{format-alist} defines a list of @dfn{file formats}, 2805 @cindex text properties in files
2806 which describe textual representations used in files for the data (text, 2806 @cindex saving text properties
2807 text-properties, and possibly other information) in an Emacs buffer. 2807 Emacs performs several steps to convert the data in a buffer (text,
2808 Emacs performs format conversion if appropriate when reading and writing 2808 text properties, and possibly other information) to and from a
2809 files. 2809 representation suitable for storing into a file. This section describes
2810 the fundamental functions that perform this @dfn{format conversion},
2811 namely @code{insert-file-contents} for reading a file into a buffer,
2812 and @code{write-region} for writing a buffer into a file.
2813
2814 @menu
2815 * Overview: Format Conversion Overview. @code{insert-file-contents} and @code{write-region}
2816 * Round-Trip: Format Conversion Round-Trip. Using @code{format-alist}.
2817 * Piecemeal: Format Conversion Piecemeal. Specifying non-paired conversion.
2818 @end menu
2819
2820 @node Format Conversion Overview
2821 @subsection Overview
2822 @noindent
2823 The function @code{insert-file-contents}:
2824
2825 @itemize
2826 @item initially, inserts bytes from the file into the buffer;
2827 @item decodes bytes to characters as appropriate;
2828 @item processes formats as defined by entries in @code{format-alist}; and
2829 @item calls functions in @code{after-insert-file-functions}.
2830 @end itemize
2831
2832 @noindent
2833 The function @code{write-region}:
2834
2835 @itemize
2836 @item initially, calls functions in @code{write-region-annotate-functions};
2837 @item processes formats as defined by entries in @code{format-alist};
2838 @item encodes characters to bytes as appropriate; and
2839 @item modifies the file with the bytes.
2840 @end itemize
2841
2842 This shows the symmetry of the lowest-level operations; reading and
2843 writing handle things in opposite order. The rest of this section
2844 describes the two facilities surrounding the three variables named
2845 above, as well as some related functions. @xref{Coding Systems}, for
2846 details on character encoding and decoding.
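
As a rough illustration of the two entry points named above, the following sketch writes a string to a temporary file with @code{write-region} and reads it back with @code{insert-file-contents}. The function name @code{my-round-trip-length} is hypothetical and chosen only for this example; it assumes no format or annotation functions are active.

```elisp
;; Hypothetical helper, not part of Emacs: write TEXT to a temporary
;; file and return the character count that `insert-file-contents'
;; reports when the file is read back.
(defun my-round-trip-length (text)
  "Write TEXT to a temporary file, read it back, and return its length."
  (let ((file (make-temp-file "fmt-demo")))
    (unwind-protect
        (progn
          (with-temp-buffer
            (insert text)
            ;; A VISIT argument that is neither t, nil, nor a string
            ;; (here the symbol `silent') inhibits the "Wrote" message.
            (write-region (point-min) (point-max) file nil 'silent))
          (with-temp-buffer
            ;; `insert-file-contents' returns a list of the absolute
            ;; file name and the length of the data inserted.
            (nth 1 (insert-file-contents file))))
      (delete-file file))))
```

Calling @code{(my-round-trip-length "hello")} exercises each step in both lists once, in opposite order.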
2847
2848 @node Format Conversion Round-Trip
2849 @subsection Round-Trip Specification
2850
2851 The more general of the two facilities is controlled by the variable
2852 @code{format-alist}, a list of @dfn{file format} specifications, which
2853 describe textual representations used in files for the data in an Emacs
2854 buffer. The descriptions for reading and writing are paired, which is
2855 why we call this ``round-trip'' specification
2856 (@pxref{Format Conversion Piecemeal}, for non-paired specification).
2810 2857
2811 @defvar format-alist 2858 @defvar format-alist
2812 This list contains one format definition for each defined file format. 2859 This list contains one format definition for each defined file format.
2860 Each format definition is a list of this form:
2861
2862 @example
2863 (@var{name} @var{doc-string} @var{regexp} @var{from-fn} @var{to-fn} @var{modify} @var{mode-fn})
2864 @end example
2813 @end defvar 2865 @end defvar
2814 2866
2815 @cindex format definition 2867 @cindex format definition
2816 Each format definition is a list of this form: 2868 @noindent
2817
2818 @example
2819 (@var{name} @var{doc-string} @var{regexp} @var{from-fn} @var{to-fn} @var{modify} @var{mode-fn})
2820 @end example
2821
2822 Here is what the elements in a format definition mean: 2869 Here is what the elements in a format definition mean:
2823 2870
2824 @table @var 2871 @table @var
2825 @item name 2872 @item name
2826 The name of this format. 2873 The name of this format.
2954 is @code{t}, the default, auto-saving uses the same format as a 3001 is @code{t}, the default, auto-saving uses the same format as a
2955 regular save in the same buffer. This variable is always buffer-local 3002 regular save in the same buffer. This variable is always buffer-local
2956 in all buffers. 3003 in all buffers.
2957 @end defvar 3004 @end defvar
2958 3005
3006 @node Format Conversion Piecemeal
3007 @subsection Piecemeal Specification
3008
3009 In contrast to the round-trip specification described in the previous
3010 subsection (@pxref{Format Conversion Round-Trip}), you can use the variables
3011 @code{after-insert-file-functions} and @code{write-region-annotate-functions}
3012 to separately control the respective reading and writing conversions.
3013
3014 Conversion starts with one representation and produces another
3015 representation. When there is only one conversion to do, there is no
3016 conflict about what to start with. However, when there are multiple
3017 conversions involved, conflict may arise when two conversions need to
3018 start with the same data.
3019
3020 This situation is best understood in the context of converting text
3021 properties during @code{write-region}. For example, the character at
3022 position 42 in a buffer is @samp{X} with a text property @code{foo}. If
3023 the conversion for @code{foo} is done by inserting into the buffer, say,
3024 @samp{FOO:}, then that changes the character at position 42 from
3025 @samp{X} to @samp{F}. The next conversion will start with the wrong
3026 data straight away.
3027
3028 To avoid conflict, cooperative conversions do not modify the buffer,
3029 but instead specify @dfn{annotations}, a list of elements of the form
3030 @code{(@var{position} . @var{string})}, sorted in order of increasing
3031 @var{position}.
3032
3033 If there is more than one conversion, @code{write-region} merges their
3034 annotations destructively into one sorted list. Later, when the text
3035 from the buffer is actually written to the file, it intermixes the
3036 specified annotations at the corresponding positions. All this takes
3037 place without modifying the buffer.
3038
3039 @c ??? What about ``overriding'' conversions like those allowed
3040 @c ??? for `write-region-annotate-functions', below? --ttn
3041
3042 In contrast, when reading, the annotations intermixed with the text
3043 are handled immediately. @code{insert-file-contents} sets point to the
3044 beginning of some text to be converted, then calls the conversion
3045 functions with the length of that text. These functions should always
3046 return with point at the beginning of the inserted text. This approach
3047 makes sense for reading because annotations removed by the first
3048 converter can't be mistakenly processed by a later converter.
3049
3050 Each conversion function should scan for the annotations it
3051 recognizes, remove the annotation, modify the buffer text (to set a text
3052 property, for example), and return the updated length of the text, as it
3053 stands after those changes. The value returned by one function becomes
3054 the argument to the next function.
3055
3056 @defvar write-region-annotate-functions
3057 A list of functions for @code{write-region} to call. Each function in
3058 the list is called with two arguments: the start and end of the region
3059 to be written. These functions should not alter the contents of the
3060 buffer. Instead, they should return annotations.
3061
3062 @c ??? Following adapted from comment in `build_annotations' (fileio.c).
3063 @c ??? Perhaps this is intended for internal use only?
3064 @c ??? Someone who understands this, please reword it. --ttn
3065 As a special case, if a function returns with a different buffer
3066 current, Emacs takes it to mean the current buffer contains altered text
3067 to be output, and discards all previous annotations because they should
3068 have been dealt with by this function.
3069 @end defvar
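
A minimal sketch of a cooperative writing conversion follows. It assumes a text property named @code{foo} and brackets each run of text carrying it with the markers @samp{<foo>} and @samp{</foo>}. The function name @code{my-foo-annotate} and both markers are hypothetical, invented for illustration.

```elisp
;; Hypothetical example: `my-foo-annotate' and the "<foo>"/"</foo>"
;; markers are illustrative and not part of Emacs.
(defun my-foo-annotate (start end)
  "Return annotations bracketing each run of text with a `foo' property.
START and END bound the region being written.  The buffer itself is
not modified; the result is a list of (POSITION . STRING) elements in
order of increasing POSITION."
  (let ((pos (or start (point-min)))    ; defensive: tolerate a nil START
        (limit (or end (point-max)))
        (annotations '()))
    (while (< pos limit)
      ;; With a LIMIT argument, this returns LIMIT when the property
      ;; does not change before LIMIT, so the loop always advances.
      (let ((next (next-single-property-change pos 'foo nil limit)))
        (when (get-text-property pos 'foo)
          (push (cons pos "<foo>") annotations)
          (push (cons next "</foo>") annotations))
        (setq pos next)))
    (nreverse annotations)))
```

One way to try it in a single buffer is @code{(add-hook 'write-region-annotate-functions #'my-foo-annotate nil t)}. Because the function only returns positions and strings, other conversions still see the original buffer text.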
3070
3071 @defvar after-insert-file-functions
3072 Each function in this list is called by @code{insert-file-contents}
3073 with one argument, the number of characters inserted, and should
3074 return the new character count, leaving point the same.
3075 @c ??? The docstring mentions a handler from `file-name-handler-alist'
3076 @c "intercepting" `insert-file-contents'. Hmmm. --ttn
3077 @end defvar
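
The reading side can be sketched in the same spirit. This example assumes a file in which runs of text are bracketed by the hypothetical markers @samp{<foo>} and @samp{</foo>}. It removes the markers, sets a @code{foo} property on the text between them, and returns the updated length. The function name @code{my-foo-decode} is likewise an assumption made for illustration.

```elisp
;; Hypothetical example: `my-foo-decode' and the "<foo>"/"</foo>"
;; markers are illustrative and not part of Emacs.
(defun my-foo-decode (length)
  "Convert <foo>...</foo> markers in the inserted text to a `foo' property.
LENGTH is the number of characters inserted, starting at point.
Return the length of the text after the markers are removed, leaving
point where it was."
  (let ((beg (point))
        ;; A marker tracks the end of the text as markers are deleted.
        (end (copy-marker (+ (point) length)))
        (case-fold-search nil))
    (save-excursion
      (while (search-forward "<foo>" end t)
        (replace-match "" t t)
        (let ((run-start (point)))
          (when (search-forward "</foo>" end t)
            (replace-match "" t t)
            (put-text-property run-start (point) 'foo t)))))
    (prog1 (- end beg)
      (set-marker end nil))))
```

Such a function could be added to @code{after-insert-file-functions} with @code{add-hook}. The value it returns is what the next function in the list receives as its argument.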
3078
3079 We invite users to write Lisp programs to store and retrieve text
3080 properties in files, using these hooks, and thus to experiment with
3081 various data formats and find good ones. Eventually we hope users
3082 will produce good, general extensions we can install in Emacs.
3083
3084 We suggest not trying to handle arbitrary Lisp objects as text property
3085 names or values---because a program that general is probably difficult
3086 to write, and slow. Instead, choose a set of possible data types that
3087 are reasonably flexible, and not too hard to encode.
3088
2959 @ignore 3089 @ignore
2960 arch-tag: 141f74ce-6ae3-40dc-a6c4-ef83fc4ec35c 3090 arch-tag: 141f74ce-6ae3-40dc-a6c4-ef83fc4ec35c
2961 @end ignore 3091 @end ignore