Mercurial > emacs
comparison man/url.texi @ 58830:27baac8434ba
url.texi: New file.
author | Stefan Monnier <monnier@iro.umontreal.ca> |
---|---|
date | Tue, 07 Dec 2004 16:55:48 +0000 |
parents | |
children | d97ebd9e30f6 |
58829:bf43c774d02c | 58830:27baac8434ba |
---|---|
1 \input texinfo | |
2 @setfilename url.info | |
3 @settitle URL Programmer's Manual | |
4 | |
5 @iftex | |
6 @c @finalout | |
7 @end iftex | |
8 @c @setchapternewpage odd | |
9 @c @smallbook | |
10 | |
11 @tex | |
12 \overfullrule=0pt | |
13 %\global\baselineskip 30pt % for printing in double space | |
14 @end tex | |
15 @dircategory World Wide Web | |
16 @dircategory GNU Emacs Lisp | |
17 @direntry | |
18 * URL: (url). URL loading package. | |
19 @end direntry | |
20 | |
21 @ifnottex | |
22 This file documents the URL loading package. | |
23 | |
24 Copyright (C) 1996, 1997, 1998, 1999, 2002, 2004 Free Software Foundation | |
25 Copyright (C) 1993, 1994, 1995, 1996 William M. Perry | |
26 | |
27 Permission is granted to copy, distribute and/or modify this document | |
28 under the terms of the GNU Free Documentation License, Version 1.1 or | |
29 any later version published by the Free Software Foundation; with the | |
30 Invariant Sections being | |
31 ``GNU GENERAL PUBLIC LICENSE''. A copy of the | |
32 license is included in the section entitled ``GNU Free Documentation | |
33 License.'' | |
34 @end ifnottex | |
35 | |
36 @c | |
37 @titlepage | |
38 @sp 6 | |
39 @center @titlefont{URL} | |
40 @center @titlefont{Programmer's Manual} | |
41 @sp 4 | |
42 @center First Edition, URL Version 2.0 | |
43 @sp 1 | |
44 @c @center December 1999 | |
45 @sp 5 | |
46 @center William M. Perry | |
47 @center @email{wmperry@@gnu.org} | |
48 @center David Love | |
49 @center @email{fx@@gnu.org} | |
50 @page | |
51 @vskip 0pt plus 1filll | |
52 Copyright @copyright{} 1993, 1994, 1995, 1996 William M. Perry@* | |
53 Copyright @copyright{} 1996, 1997, 1998, 1999, 2002 Free Software Foundation | |
54 | |
55 Permission is granted to copy, distribute and/or modify this document | |
56 under the terms of the GNU Free Documentation License, Version 1.1 or | |
57 any later version published by the Free Software Foundation; with the | |
58 Invariant Sections being | |
59 ``GNU GENERAL PUBLIC LICENSE''. A copy of the | |
60 license is included in the section entitled ``GNU Free Documentation | |
61 License.'' | |
62 @end titlepage | |
63 @page | |
64 @node Top | |
65 @top URL | |
66 | |
67 | |
68 | |
69 @menu | |
70 * Getting Started:: Preparing your program to use URLs. | |
71 * Retrieving URLs:: How to use this package to retrieve a URL. | |
72 * Supported URL Types:: Descriptions of URL types currently supported. | |
73 * Defining New URLs:: How to define a URL loader for a new protocol. | |
74 * General Facilities:: URLs can be cached, accessed via a gateway | |
75 and tracked in a history list. | |
76 * Customization:: Variables you can alter. | |
77 * Function Index:: | |
78 * Variable Index:: | |
79 * Concept Index:: | |
80 @end menu | |
81 | |
82 @node Getting Started | |
83 @chapter Getting Started | |
84 @cindex URLs, definition | |
85 @cindex URIs | |
86 | |
87 @dfn{Uniform Resource Locators} (URLs) are a specific form of | |
88 @dfn{Uniform Resource Identifiers} (URIs) described in RFC 2396, which | |
89 updates RFC 1738 and RFC 1808. RFC 2016 defines uniform resource | |
90 agents. | |
91 | |
92 URIs have the form @var{scheme}:@var{scheme-specific-part}, where the | |
93 @var{scheme}s supported by this library are described below. | |
94 @xref{Supported URL Types}. | |
95 | |
96 FTP, NFS, HTTP, HTTPS, @code{rlogin}, @code{telnet}, tn3270, | |
97 IRC and gopher URLs all have the form | |
98 | |
99 @example | |
100 @var{scheme}://@r{[}@var{userinfo}@@@r{]}@var{hostname}@r{[}:@var{port}@r{]}@r{[}/@var{path}@r{]} | |
101 @end example | |
102 @noindent | |
103 where @samp{@r{[}} and @samp{@r{]}} delimit optional parts. | |
104 @var{userinfo} sometimes takes the form @var{username}:@var{password} | |
105 but you should beware of the security risks of sending cleartext | |
106 passwords. @var{hostname} may be a domain name or a dotted decimal | |
107 address. If the @samp{:@var{port}} is omitted then the library will | |
108 use the `well known' port for that service when accessing URLs. With | |
109 the possible exception of @code{telnet}, it is rare for ports to be | |
110 specified, and using a non-standard port may have | |
111 undesired consequences if a different service is listening on that | |
112 port (e.g.@: an HTTP URL specifying the SMTP port can cause mail to be | |
113 sent).@c , but @xref{Other Variables, url-bad-port-list}. | |
114 The meaning of | |
115 the @var{path} component depends on the service. | |
116 | |
117 The library depends on MIME support provided by the @samp{mm-} | |
118 packages from Gnus 5.8 or later. @xref{(emacs-mime)Top, The MIME | |
119 library}. | |
120 | |
121 @menu | |
122 * Configuration:: | |
123 * Parsed URLs:: URLs are parsed into vector structures. | |
124 @end menu | |
125 | |
126 @node Configuration | |
127 @section Configuration | |
128 | |
129 @defvar url-configuration-directory | |
130 @cindex @file{~/.url} | |
131 @cindex configuration files | |
132 The directory in which URL configuration files, the cache etc., | |
133 reside. Default @file{~/.url}. | |
134 @end defvar | |
135 | |
136 @node Parsed URLs | |
137 @section Parsed URLs | |
138 @cindex parsed URLs | |
139 The library functions typically operate on @dfn{parsed} versions of | |
140 URLs. These are actually vectors of the form: | |
141 | |
142 @example | |
143 [@var{type} @var{user} @var{password} @var{host} @var{port} @var{file} @var{target} @var{attributes} @var{full}] | |
144 @end example | |
145 | |
146 @noindent where | |
147 @table @var | |
148 @item type | |
149 is the type of the URL scheme, e.g.@: @code{http}; | |
150 @item user | |
151 is the username associated with it, or @code{nil}; | |
152 @item password | |
153 is the user password associated with it, or @code{nil}; | |
154 @item host | |
155 is the host name associated with it, or @code{nil}; | |
156 @item port | |
157 is the port number associated with it, or @code{nil}; | |
158 @item file | |
159 is the `file' part of it, or @code{nil}. This doesn't necessarily | |
160 actually refer to a file; | |
161 @item target | |
162 is the target part, or @code{nil}; | |
163 @item attributes | |
164 is the attributes associated with it, or @code{nil}; | |
165 @item full | |
166 is @code{t} for a fully-specified URL, with a host part indicated by | |
167 @samp{//} after the scheme part. | |
168 @end table | |
169 | |
170 @findex url-type | |
171 @findex url-user | |
172 @findex url-password | |
173 @findex url-host | |
174 @findex url-port | |
175 @findex url-file | |
176 @findex url-target | |
177 @findex url-attributes | |
178 @findex url-full | |
179 @findex url-set-type | |
180 @findex url-set-user | |
181 @findex url-set-password | |
182 @findex url-set-host | |
183 @findex url-set-port | |
184 @findex url-set-file | |
185 @findex url-set-target | |
186 @findex url-set-attributes | |
187 @findex url-set-full | |
188 These attributes have accessors named @code{url-@var{part}}, where | |
189 @var{part} is the name of one of the elements above, e.g.@: | |
190 @code{url-host}. Similarly, there are setters of the form | |
191 @code{url-set-@var{part}}. | |
192 | |
193 There are functions for parsing and unparsing between the string and | |
194 vector forms. | |
195 | |
196 @defun url-generic-parse-url url | |
197 Return a parsed version of the string @var{url}. | |
198 @end defun | |
199 | |
200 @defun url-recreate-url url | |
201 @cindex unparsing URLs | |
202 Recreates a URL string from the parsed @var{url}. | |
203 @end defun | |
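| |
As a minimal sketch of how these functions fit together, the following | |
parses a string and inspects two of its parts. The URL is an | |
illustrative placeholder, and loading the @code{url-parse} feature is | |
assumed to make the parsing functions available: | |
| |
@example | |
(require 'url-parse) | |
| |
(url-type (url-generic-parse-url "http://www.example.com/foo/bar")) | |
     @result{} "http" | |
(url-host (url-generic-parse-url "http://www.example.com/foo/bar")) | |
     @result{} "www.example.com" | |
| |
;; Convert the parsed form back into a string. | |
(url-recreate-url | |
 (url-generic-parse-url "http://www.example.com/foo/bar")) | |
@end example | |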
204 | |
205 @node Retrieving URLs | |
206 @chapter Retrieving URLs | |
207 | |
208 @defun url-retrieve-synchronously url | |
209 Retrieve @var{url} synchronously and return a buffer containing the | |
210 data. @var{url} is either a string or a parsed URL structure. Return | |
211 @code{nil} if there are no data associated with it (the case for dired, | |
212 info, or mailto URLs that need no further processing). | |
213 @end defun | |
214 | |
215 @defun url-retrieve url callback &optional cbargs | |
216 Retrieve @var{url} asynchronously and call @var{callback} with args | |
217 @var{cbargs} when finished. The callback is called when the object | |
218 has been completely retrieved, with the current buffer containing the | |
219 object and any MIME headers associated with it. @var{url} is either a | |
220 string or a parsed URL structure. Returns the buffer @var{url} will | |
221 load into, or @code{nil} if the process has already completed. | |
222 @end defun | |
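| |
The following is a sketch only, assuming network access and using | |
@samp{http://www.example.com/} as a placeholder URL. The callback | |
accepts any arguments, so that it does not depend on exactly what the | |
library passes to it: | |
| |
@example | |
(require 'url) | |
| |
;; Synchronous: wait for the data, then examine the buffer. | |
(let ((buffer (url-retrieve-synchronously "http://www.example.com/"))) | |
  (when buffer | |
    (with-current-buffer buffer | |
      (message "Retrieved %d characters" (buffer-size))))) | |
| |
;; Asynchronous: the callback runs with the data buffer current. | |
(url-retrieve "http://www.example.com/" | |
              (lambda (&rest ignored) | |
                (message "Retrieved %d characters" (buffer-size)))) | |
@end example | |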
223 | |
224 @node Supported URL Types | |
225 @chapter Supported URL Types | |
226 | |
227 @menu | |
228 * http/https:: Hypertext Transfer Protocol. | |
229 * file/ftp:: Local files and FTP archives. | |
230 * info:: Emacs `Info' pages. | |
231 * mailto:: Sending email. | |
232 * news/nntp/snews:: Usenet news. | |
233 * rlogin/telnet/tn3270:: Remote host connectivity. | |
234 * irc:: Internet Relay Chat. | |
235 * data:: Embedded data URLs. | |
236 * nfs:: Networked File System | |
237 @c * finger:: | |
238 @c * gopher:: | |
239 @c * netrek:: | |
240 @c * prospero:: | |
241 * cid:: Content-ID. | |
242 * about:: | |
243 * ldap:: Lightweight Directory Access Protocol | |
244 * imap:: IMAP mailboxes. | |
245 * man:: Unix man pages. | |
246 @end menu | |
247 | |
248 @node http/https | |
249 @section @code{http} and @code{https} | |
250 | |
251 The scheme @code{http} is Hypertext Transfer Protocol. The library | |
252 supports version 1.1, specified in RFC 2616. (This supersedes 1.0, | |
253 defined in RFC 1945.) HTTP URLs have the following form, where most of | |
254 the parts are optional: | |
255 @example | |
256 http://@var{user}:@var{password}@@@var{host}:@var{port}/@var{path}?@var{searchpart}#@var{fragment} | |
257 @end example | |
258 @c The @code{:@var{port}} part is optional, and @var{port} defaults to | |
259 @c 80. The @code{/@var{path}} part, if present, is a slash-separated | |
260 @c series elements. The @code{?@var{searchpart}}, if present, is the | |
261 @c query for a search or the content of a form submission. The | |
262 @c @code{#fragment} part, if present, is a location in the document. | |
263 | |
264 The scheme @code{https} is a secure version of @code{http}, with | |
265 transmission via SSL. It is defined in RFC 2818. Its default port is | |
266 443. This scheme depends on SSL support in Emacs via the | |
267 @file{ssl.el} library and is actually implemented by forcing the | |
268 @code{ssl} gateway method to be used. @xref{Gateways in general}. | |
269 | |
270 @defopt url-honor-refresh-requests | |
271 This controls honouring of HTTP @samp{Refresh} headers by which | |
272 servers can direct clients to reload documents from the same URL or a | |
273 different one. @code{nil} means they will not be honoured, | |
274 @code{t} (the default) means they will always be honoured, and | |
275 otherwise the user will be asked on each request. | |
276 @end defopt | |
277 | |
278 | |
279 @menu | |
280 * Cookies:: | |
281 * HTTP language/coding:: | |
282 * HTTP URL Options:: | |
283 * Dealing with HTTP documents:: | |
284 @end menu | |
285 | |
286 @node Cookies | |
287 @subsection Cookies | |
288 | |
289 @defopt url-cookie-file | |
290 The file in which cookies are stored, defaulting to @file{cookies} in | |
291 the directory specified by @code{url-configuration-directory}. | |
292 @end defopt | |
293 | |
294 @defopt url-cookie-confirmation | |
295 Specifies whether confirmation is required to accept cookies. | |
296 @end defopt | |
297 | |
298 @defopt url-cookie-multiple-line | |
299 Specifies whether to put all cookies for the server on one line in the | |
300 HTTP request to satisfy broken servers like | |
301 @url{http://www.hotmail.com}. | |
302 @end defopt | |
303 | |
304 @defopt url-cookie-trusted-urls | |
305 A list of regular expressions matching URLs from which to accept | |
306 cookies always. | |
307 @end defopt | |
308 | |
309 @defopt url-cookie-untrusted-urls | |
310 A list of regular expressions matching URLs from which to reject | |
311 cookies always. | |
312 @end defopt | |
313 | |
314 @defopt url-cookie-save-interval | |
315 The number of seconds between automatic saves of cookies to disk. | |
316 Default is one hour. | |
317 @end defopt | |
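| |
As an illustration of how these options combine, the sketch below asks | |
for confirmation in general while always accepting cookies from one | |
site and always rejecting them from another. The host names are | |
placeholders, not recommendations: | |
| |
@example | |
(setq url-cookie-confirmation t) | |
(setq url-cookie-trusted-urls '("^http://www\\.example\\.com/")) | |
(setq url-cookie-untrusted-urls '("^http://www\\.example\\.net/")) | |
@end example | |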
318 | |
319 | |
320 @node HTTP language/coding | |
321 @subsection Language and Encoding Preferences | |
322 | |
323 HTTP allows clients to express preferences for the language and | |
324 encoding of documents which servers may honour. | |
325 | |
326 @defopt url-mime-charset-string | |
327 @cindex character sets | |
328 @cindex coding systems | |
329 This variable specifies a preference for character sets when documents | |
330 can be served in more than one encoding. | |
331 | |
332 HTTP allows specifying a list of MIME charsets which indicate your | |
333 preferred character set encodings, e.g.@: Latin-9 or Big5, and these | |
334 can be weighted. In Emacs 21 this list is generated automatically | |
335 from the list of defined coding systems which have associated MIME | |
336 types. These are sorted by coding priority. @xref{Recognize Coding, | |
337 , Recognizing Coding Systems, emacs, GNU Emacs Manual}. | |
338 @end defopt | |
339 | |
340 @defopt url-mime-language-string | |
341 @cindex language preferences | |
342 A string specifying the preferred language when servers can serve | |
343 files in several languages. Use RFC 1766 abbreviations, e.g.@: | |
344 @samp{en} for English, @samp{de} for German. It can be a | |
345 comma-separated list in descending order of preference. The ordering | |
346 can be made explicit using `q' factors defined by HTTP, e.g.@: | |
347 @w{@samp{de, en-gb;q=0.8, en;q=0.7}}. It can be @samp{*} to get the | |
348 first available language (as opposed to the default). | |
349 @end defopt | |
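| |
For instance, a user who prefers German but also accepts English might | |
put the following in an init file. This is only an illustration that | |
reuses the value shown above: | |
| |
@example | |
(setq url-mime-language-string "de, en-gb;q=0.8, en;q=0.7") | |
@end example | |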
350 | |
351 @node HTTP URL Options | |
352 @subsection HTTP URL Options | |
353 | |
354 HTTP supports an @samp{OPTIONS} method describing things supported by | |
355 the URL@. | |
356 | |
357 @defun url-http-options url | |
358 Returns a property list describing options available for @var{url}. The | |
359 property list members are: | |
360 | |
361 @table @code | |
362 @item methods | |
363 A list of symbols specifying what HTTP methods the resource | |
364 supports. | |
365 | |
366 @item dav | |
367 @cindex DAV | |
368 A list of numbers specifying what DAV protocol/schema versions are | |
369 supported. | |
370 | |
371 @item dasl | |
372 @cindex DASL | |
373 A list of the supported DASL search types (string form). | |
374 | |
375 @item ranges | |
376 A list of the units available for use in partial document fetches. | |
377 | |
378 @item p3p | |
379 @cindex P3P | |
380 The @dfn{Platform for Privacy Preferences} description for the resource. | |
381 Currently this is just the raw header contents. | |
382 @end table | |
383 | |
384 @end defun | |
385 | |
386 @node Dealing with HTTP documents | |
387 @subsection Dealing with HTTP documents | |
388 | |
389 HTTP URLs are retrieved into a buffer containing the HTTP headers | |
390 followed by the body. Since the headers are quasi-MIME, they may be | |
391 processed using the MIME library. @xref{(emacs-mime)Top, The MIME | |
392 library}. The MIME library doesn't provide a clean function to do | |
393 that, so the URL library does. | |
394 | |
395 @defun url-decode-text-part handle &optional coding | |
396 This function decodes charset-encoded text in the current buffer. In | |
397 Emacs, the buffer is expected to be unibyte initially and is set to | |
398 multibyte after decoding. | |
399 @var{handle} is the MIME handle of the original part. @var{coding} is an explicit | |
400 coding to use, overriding what the MIME headers specify. | |
401 The coding system used for the decoding is returned. | |
402 | |
403 Note that this function doesn't deal with @samp{http-equiv} charset | |
404 specifications in HTML @samp{<meta>} elements. | |
405 @end defun | |
406 | |
407 @node file/ftp | |
408 @section file and ftp | |
409 @cindex files | |
410 @cindex FTP | |
411 @cindex File Transfer Protocol | |
412 @cindex compressed files | |
413 @findex dired | |
414 | |
415 @example | |
416 ftp://@var{user}:@var{password}@@@var{host}:@var{port}/@var{file} | |
417 file://@var{user}:@var{password}@@@var{host}:@var{port}/@var{file} | |
418 @end example | |
419 | |
420 These schemes are defined in RFC 1808. | |
421 @samp{ftp:} and @samp{file:} are synonymous in this library. They | |
422 allow reading arbitrary files from hosts. Either @samp{ange-ftp} | |
423 (Emacs) or @samp{efs} (XEmacs) is used to retrieve them from remote | |
424 hosts. Local files are accessed directly. | |
425 | |
426 Compressed files are handled, but support is hard-coded so that | |
427 @code{jka-compr-compression-info-list} and so on have no effect. | |
428 Suffixes recognized are @samp{.z}, @samp{.gz}, @samp{.Z} and | |
429 @samp{.bz2}. | |
430 | |
431 @defopt url-directory-index-file | |
432 The filename to look for when indexing a directory, default | |
433 @samp{"index.html"}. If this file exists, and is readable, then it | |
434 will be viewed instead of using @code{dired} to view the directory. | |
435 @end defopt | |
436 | |
437 @node info | |
438 @section info | |
439 @cindex Info | |
440 @cindex Texinfo | |
441 @findex Info-goto-node | |
442 | |
443 @example | |
444 info:@var{file}#@var{node} | |
445 @end example | |
446 | |
447 Info URLs are not officially defined. They invoke | |
448 @code{Info-goto-node} with argument @samp{(@var{file})@var{node}}. | |
449 @samp{#@var{node}} is optional, defaulting to @samp{Top}. | |
450 | |
451 @node mailto | |
452 @section mailto | |
453 | |
454 @cindex mailto | |
455 @cindex email | |
456 A mailto URL will send an email message to the address in the | |
457 URL, for example @samp{mailto:foo@@bar.com} would compose a | |
458 message to @samp{foo@@bar.com}. | |
459 | |
460 @defopt url-mail-command | |
461 @vindex mail-user-agent | |
462 The function called whenever url needs to send mail. This should | |
463 normally be left to default from @code{mail-user-agent}. @xref{Mail | |
464 Methods, , Mail-Composition Methods, emacs, GNU Emacs Manual}. | |
465 @end defopt | |
466 | |
467 An @samp{X-Url-From} header field containing the URL of the document | |
468 that contained the mailto URL is added if that URL is known. | |
469 | |
470 RFC 2368 extends the definition of mailto URLs in RFC 1738. | |
471 The form of a mailto URL is | |
472 @example | |
473 @samp{mailto:@var{mailbox}[?@var{header}=@var{contents}[&@var{header}=@var{contents}]]} | |
474 @end example | |
475 @noindent where an arbitrary number of @var{header}s can be added. If the | |
476 @var{header} is @samp{body}, then @var{contents} is put in the body; | |
477 otherwise a @var{header} header field is created with @var{contents} | |
478 as its contents. Note that the URL library does not consider any | |
479 headers `dangerous' so you should check them before sending the | |
480 message. | |
481 | |
482 @c Fixme: update | |
483 Email messages are defined in @sc{rfc}822. | |
484 | |
485 @node news/nntp/snews | |
486 @section @code{news}, @code{nntp} and @code{snews} | |
487 @cindex news | |
488 @cindex network news | |
489 @cindex usenet | |
490 @cindex NNTP | |
491 @cindex snews | |
492 | |
493 @c draft-gilman-news-url-01 | |
494 The network news URL schemes take the following forms, following RFC | |
495 1738, except that for compatibility with other clients, host and port | |
496 fields may be included in news URLs though they are properly only | |
497 allowed for nntp and snews. | |
498 | |
499 @table @samp | |
500 @item news:@var{newsgroup} | |
501 Retrieves a list of messages in @var{newsgroup}; | |
502 @item news:@var{message-id} | |
503 Retrieves the message with the given @var{message-id}; | |
504 @item news:* | |
505 Retrieves a list of all available newsgroups; | |
506 @item nntp://@var{host}:@var{port}/@var{newsgroup} | |
507 @itemx nntp://@var{host}:@var{port}/@var{message-id} | |
508 @itemx nntp://@var{host}:@var{port}/* | |
509 Similar to the @samp{news} versions. | |
510 @end table | |
511 | |
512 @samp{:@var{port}} is optional and defaults to :119. | |
513 | |
514 @samp{snews} is the same as @samp{nntp} except that the default port | |
515 is :563. | |
516 @cindex SSL | |
517 (It is tunnelled through SSL.) | |
518 | |
519 An @samp{nntp} URL is the same as a news URL, except that the URL may | |
520 specify an article by its number. | |
521 | |
522 @defopt url-news-server | |
523 This variable can be used to override the default news server. | |
524 Usually this will be set by the Gnus package, which is used to fetch | |
525 news. | |
526 @cindex environment variable | |
527 @vindex NNTPSERVER | |
528 It may be set from the conventional environment variable | |
529 @code{NNTPSERVER}. | |
530 @end defopt | |
531 | |
532 @node rlogin/telnet/tn3270 | |
533 @section rlogin, telnet and tn3270 | |
534 @cindex rlogin | |
535 @cindex telnet | |
536 @cindex tn3270 | |
537 @cindex terminal emulation | |
538 @findex terminal-emulator | |
539 | |
540 These URL schemes from RFC 1738 for logon via a terminal emulator have | |
541 the form | |
542 @example | |
543 telnet://@var{user}:@var{password}@@@var{host}:@var{port} | |
544 @end example | |
545 but the @code{:@var{password}} component is ignored. | |
546 | |
547 To handle rlogin, telnet and tn3270 URLs, a @code{rlogin}, | |
548 @code{telnet} or @code{tn3270} (the program names and arguments are | |
549 hardcoded) session is run in a @code{terminal-emulator} buffer. | |
550 Well-known ports are used if the URL does not specify a port. | |
551 | |
552 @node irc | |
553 @section irc | |
554 @cindex IRC | |
555 @cindex Internet Relay Chat | |
556 @cindex ZEN IRC | |
557 @c Fixme: reference (was http://www.w3.org/Addressing/draft-mirashi-url-irc-01.txt) | |
558 @dfn{Internet Relay Chat} (IRC) is handled by handing off the @sc{irc} | |
559 session to a function named in @code{url-irc-function}. | |
560 | |
561 @defopt url-irc-function | |
562 A function to actually open an IRC connection. | |
563 This function | |
564 must take five arguments, @var{host}, @var{port}, @var{channel}, | |
565 @var{user} and @var{password}. The @var{channel} argument specifies the | |
566 channel to join immediately; this can be @code{nil}. By default this is | |
567 @code{url-irc-zenirc}. | |
568 @end defopt | |
569 @defun url-irc-zenirc host port channel user password | |
570 Processes the arguments and lets @code{zenirc} handle the session. | |
571 @end defun | |
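| |
A different IRC client can be used by supplying a function that takes | |
the same five arguments. In this sketch, @code{my-url-irc-handler} is | |
a hypothetical name and its body merely reports what it was given: | |
| |
@example | |
(defun my-url-irc-handler (host port channel user password) | |
  "Hypothetical handler: report the requested IRC session." | |
  (message "IRC: %s@@%s:%s, channel %s" | |
           (or user "anonymous") host port (or channel "(none)"))) | |
| |
(setq url-irc-function 'my-url-irc-handler) | |
@end example | |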
572 | |
573 @node data | |
574 @section data | |
575 @cindex data URLs | |
576 | |
577 @example | |
578 data:@r{[}@var{media-type}@r{]}@r{[};base64@r{]},@var{data} | |
579 @end example | |
580 | |
581 Data URLs contain MIME data in the URL itself. They are defined in | |
582 RFC 2397. | |
583 | |
584 @var{media-type} is a MIME @samp{Content-Type} string, possibly | |
585 including parameters. It defaults to | |
586 @samp{text/plain;charset=US-ASCII}. The @samp{text/plain} can be | |
587 omitted but the charset parameter supplied. If @samp{;base64} is | |
588 present, the @var{data} are base64-encoded. | |
589 | |
590 @node nfs | |
591 @section nfs | |
592 @cindex NFS | |
593 @cindex Network File System | |
594 @cindex automounter | |
595 | |
596 @example | |
597 nfs://@var{user}:@var{password}@@@var{host}:@var{port}/@var{file} | |
598 @end example | |
599 | |
600 The @samp{nfs:} scheme is defined in RFC 2224. It is similar to | |
601 @samp{ftp:} except that it points to a file on a remote host that is | |
602 handled by the automounter on the local host. | |
603 | |
604 @defvar url-nfs-automounter-directory-spec | |
605 A string saying how to invoke the NFS automounter. Certain @samp{%} | |
606 sequences are recognized: | |
607 | |
608 @table @samp | |
609 @item %h | |
610 The hostname of the NFS server; | |
611 @item %n | |
612 The port number of the NFS server; | |
613 @item %u | |
614 The username to use to authenticate; | |
615 @item %p | |
616 The password to use to authenticate; | |
617 @item %f | |
618 The filename on the remote server; | |
619 @item %% | |
620 A literal @samp{%}. | |
621 @end table | |
622 | |
623 Each can be used any number of times. | |
624 @end defvar | |
625 | |
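As an illustration of @code{url-nfs-automounter-directory-spec}, on a | |
system whose automounter makes remote hosts available under @file{/net} | |
(an assumption about the local setup), the spec could map each NFS URL | |
to a local file name like this: | |
| |
@example | |
(setq url-nfs-automounter-directory-spec "file:/net/%h%f") | |
@end example | |
| |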
626 @node cid | |
627 @section cid | |
628 @cindex Content-ID | |
629 | |
630 The @samp{cid:} scheme is defined in RFC 2111. | |
631 | |
632 @node about | |
633 @section about | |
634 | |
635 @node ldap | |
636 @section ldap | |
637 @cindex LDAP | |
638 @cindex Lightweight Directory Access Protocol | |
639 | |
640 The LDAP scheme is defined in RFC 2255. | |
641 | |
642 @node imap | |
643 @section imap | |
644 @cindex IMAP | |
645 | |
646 The @samp{imap:} scheme is defined in RFC 2192. | |
647 | |
648 @node man | |
649 @section man | |
650 @cindex @command{man} | |
651 @cindex Unix man pages | |
652 @findex man | |
653 | |
654 @example | |
655 @samp{man:@var{page-spec}} | |
656 @end example | |
657 | |
658 This is a non-standard scheme. @var{page-spec} is passed directly to | |
659 the Lisp @code{man} function. | |
660 | |
661 @node Defining New URLs | |
662 @chapter Defining New URLs | |
663 | |
664 @menu | |
665 * Naming conventions:: | |
666 * Required functions:: | |
667 * Optional functions:: | |
668 * Asynchronous fetching:: | |
669 * Supporting file-name-handlers:: | |
670 @end menu | |
671 | |
672 @node Naming conventions | |
673 @section Naming conventions | |
674 | |
675 @node Required functions | |
676 @section Required functions | |
677 | |
678 @node Optional functions | |
679 @section Optional functions | |
680 | |
681 @node Asynchronous fetching | |
682 @section Asynchronous fetching | |
683 | |
684 @node Supporting file-name-handlers | |
685 @section Supporting file-name-handlers | |
686 | |
687 @node General Facilities | |
688 @chapter General Facilities | |
689 | |
690 @menu | |
691 * Disk Caching:: | |
692 * Proxies:: | |
693 * Gateways in general:: | |
694 * History:: | |
695 @end menu | |
696 | |
697 @node Disk Caching | |
698 @section Disk Caching | |
699 @cindex Caching | |
700 @cindex Persistent Cache | |
701 @cindex Disk Cache | |
702 | |
703 The disk cache stores retrieved documents locally, whence they can be | |
704 retrieved more quickly. When requesting a URL that is in the cache, | |
705 the library checks to see if the page has changed since it was last | |
706 retrieved from the remote machine. If not, the local copy is used, | |
707 saving the transmission over the network. | |
708 @cindex Cleaning the cache | |
709 @cindex Clearing the cache | |
710 @cindex Cache cleaning | |
711 Currently the cache isn't cleared automatically. | |
712 @c Running the @code{clean-cache} shell script | |
713 @c fist is recommended, to allow for future cleaning of the cache. This | |
714 @c shell script will remove all files that have not been accessed since it | |
715 @c was last run. To keep the cache pared down, it is recommended that this | |
716 @c script be run from @i{at} or @i{cron} (see the manual pages for | |
717 @c crontab(5) or at(1) for more information) | |
718 | |
719 @defopt url-automatic-caching | |
720 Setting this variable non-@code{nil} causes documents to be cached | |
721 automatically. | |
722 @end defopt | |
723 | |
724 @defopt url-cache-directory | |
725 This variable specifies the | |
726 directory in which to store the cache files. It defaults to the subdirectory | |
727 @file{cache} of @code{url-configuration-directory}. | |
728 @end defopt | |
729 | |
730 @c Fixme: function v. option, but neither used. | |
731 @c @findex url-cache-expired | |
732 @c @defopt url-cache-expired | |
733 @c This is a function to decide whether or not a cache entry has expired. | |
734 @c It takes two times as it parameters and returns non-@code{nil} if the | |
735 @c second time is ``too old'' when compared with the first time. | |
736 @c @end defopt | |
737 | |
738 @defopt url-cache-creation-function | |
739 The cache relies on a scheme for mapping URLs to files in the cache. | |
740 This variable names a function which sets the type of cache to use. | |
741 It takes a URL as argument and returns the absolute file name of the | |
742 corresponding cache file. The two supplied possibilities are | |
743 @code{url-cache-create-filename-using-md5} and | |
744 @code{url-cache-create-filename-human-readable}. | |
745 @end defopt | |
746 | |
747 @defun url-cache-create-filename-using-md5 url | |
748 Creates a cache file name from @var{url} using MD5 hashing. | |
749 @findex md5 | |
750 This creates entries with very few cache collisions and is fast if | |
751 you have the @code{md5} function as a primitive (Emacs 21 and XEmacs). | |
752 @smallexample | |
753 (url-cache-create-filename-using-md5 "http://www.example.com/foo/bar") | |
754 @result{} "/home/fx/.url/cache/fx/http/com/example/www/b8a35774ad20db71c7c3409a5410e74f" | |
755 @end smallexample | |
756 @end defun | |
757 | |
758 @defun url-cache-create-filename-human-readable url | |
759 Creates a cache file name from @var{url} more obviously connected to | |
760 @var{url} than for @code{url-cache-create-filename-using-md5}, but | |
761 more likely to conflict with other files. | |
762 @smallexample | |
763 (url-cache-create-filename-human-readable "http://www.example.com/foo/bar") | |
764 @result{} "/home/fx/.url/cache/fx/http/com/example/www/foo/bar" | |
765 @end smallexample | |
766 @end defun | |
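| |
Putting these options together, a configuration that enables caching | |
and selects the MD5-based naming scheme might look like the following | |
sketch. The choice of naming function here is illustrative: | |
| |
@example | |
(setq url-automatic-caching t) | |
(setq url-cache-creation-function | |
      'url-cache-create-filename-using-md5) | |
@end example | |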
767 | |
768 @c Fixme: never actually used currently? | |
769 @c @defopt url-standalone-mode | |
770 @c @cindex Relying on cache | |
771 @c @cindex Cache only mode | |
772 @c @cindex Standalone mode | |
773 @c If this variable is non-@code{nil}, the library relies solely on the | |
774 @c cache for fetching documents and avoids checking if they have changed | |
775 @c on remote servers. | |
776 @c @end defopt | |
777 | |
778 @c With a large cache of documents on the local disk, it can be very handy | |
779 @c when traveling, or any other time the network connection is not active | |
780 @c (a laptop with a dial-on-demand PPP connection, etc). Emacs/W3 can rely | |
781 @c solely on its cache, and avoid checking to see if the page has changed | |
782 @c on the remote server. In the case of a dial-on-demand PPP connection, | |
783 @c this will keep the phone line free as long as possible, only bringing up | |
784 @c the PPP connection when asking for a page that is not located in the | |
785 @c cache. This is very useful for demonstrations as well. | |
786 | |
787 @node Proxies | |
788 @section Proxies and Gatewaying | |
789 | |
790 @c fixme: check/document url-ns stuff | |
791 @cindex proxy servers | |
792 @cindex proxies | |
793 @cindex environment variables | |
794 @vindex HTTP_PROXY | |
795 Proxy servers are commonly used to provide gateways through firewalls | |
796 or as caches serving some more-or-less local network. Each protocol | |
797 (HTTP, FTP, etc.)@: can have a different gateway server. Proxying is | |
798 conventionally configured for many different programs through | |
799 environment variables of the form @code{@var{protocol}_proxy}, where | |
800 @var{protocol} is one of the supported network protocols (@code{http}, | |
801 @code{ftp} etc.). The library recognizes such variables in either | |
802 upper or lower case. Their values are of one of the forms: | |
803 @itemize @bullet | |
804 @item @code{@var{host}:@var{port}} | |
805 @item A full URL; | |
806 @item Simply a host name. | |
807 @end itemize | |
808 | |
809 @vindex NO_PROXY | |
810 The @code{NO_PROXY} environment variable specifies URLs that should be | |
811 excluded from proxying (on servers that should be contacted directly). | |
812 This should be a comma-separated list of hostnames, domain names, or a | |
813 mixture of both. Asterisks can be used as wildcards, but other | |
814 clients may not support that. Domain names may be indicated by a | |
815 leading dot. For example: | |
816 @example | |
817 NO_PROXY="*.aventail.com,home.com,.seanet.com" | |
818 @end example | |
819 @noindent says to contact all machines in the @samp{aventail.com} and | |
820 @samp{seanet.com} domains directly, as well as the machine named | |
821 @samp{home.com}. If @code{NO_PROXY} isn't defined, @code{no_PROXY} | |
822 and @code{no_proxy} are also tried, in that order. | |
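
As a rough sketch, the same variables can be set from Lisp with
@code{setenv}, assuming this is done before the library reads the
environment.  The host names and port below are placeholders, not
values taken from any real configuration.

@example
;; Hypothetical proxy host and port, in the HOST:PORT form.
(setenv "http_proxy" "proxy.example.com:8080")
;; Contact machines in example.com, and localhost, directly.
(setenv "no_proxy" ".example.com,localhost")
@end example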
823 | |
824 Proxies may also be specified directly in Lisp. | |
825 | |
826 @defopt url-proxy-services | |
827 This variable is an alist of URL schemes and proxy servers that | |
828 gateway them. The items are of the form @w{@code{(@var{scheme} | |
829 . @var{host}:@var{portnumber})}}, which says that the URL @var{scheme} is | |
830 gatewayed through @var{portnumber} on the specified @var{host}. An | |
831 exception is the pseudo scheme @code{"no_proxy"}, which is paired with | |
832 a regexp matching host names not to be proxied. This variable is | |
833 initialized from the environment as above. | |
834 | |
835 @example | |
836 (setq url-proxy-services | |
837 '(("http" . "proxy.aventail.com:80") | |
838 ("no_proxy" . "^.*\\(aventail\\|seanet\\)\\.com"))) | |
839 @end example | |
840 @end defopt | |
841 | |
842 @node Gateways in general | |
843 @section Gateways in General | |
844 @cindex gateways | |
845 @cindex firewalls | |
846 | |
847 The library provides a general gateway layer through which all | |
848 networking passes. It can both control access to the network and | |
849 provide access through gateways in firewalls. This may make direct | |
850 connexions in some cases and pass through some sort of gateway in | |
851 others.@footnote{Proxies (which only operate over HTTP) are | |
852 implemented using this.} The library's basic function responsible for | |
853 making connexions is @code{url-open-stream}. | |
854 | |
855 @defun url-open-stream name buffer host service | |
856 @cindex opening a stream | |
857 @cindex stream, opening | |
858 Open a stream to @var{host}, possibly via a gateway. The other | |
859 arguments are as for @code{open-network-stream}. This will not make a | |
860 connexion if @code{url-gateway-unplugged} is non-@code{nil}. | |
861 @end defun | |
862 | |
863 @defvar url-gateway-local-host-regexp | |
864 This is a regular expression that matches local hosts that do not | |
865 require the use of a gateway. If @code{nil}, all connexions are made | |
866 through the gateway. | |
867 @end defvar | |
868 | |
869 @defvar url-gateway-method | |
870 This variable controls which gateway method is used. It may be useful | |
871 to bind it temporarily in some applications. It has values taken from | |
872 a list of symbols. Possible values are: | |
873 | |
874 @table @code | |
875 @item telnet | |
876 @cindex @command{telnet} | |
877 Use this method if you must first telnet and log into a gateway host, | |
878 and then run telnet from that host to connect to outside machines. | |
879 | |
880 @item rlogin | |
881 @cindex @command{rlogin} | |
882 This method is identical to @code{telnet}, but uses @command{rlogin} | |
883 to log into the remote machine without having to send the username and | |
884 password over the wire every time. | |
885 | |
886 @item socks | |
887 @cindex @sc{socks} | |
888 Use if the firewall has a @sc{socks} gateway running on it. The | |
889 @sc{socks} v5 protocol is defined in RFC 1928. | |
890 | |
891 @c @item ssl | |
892 @c This probably shouldn't be documented | |
893 @c Fixme: why not? -- fx | |
894 | |
895 @item native | |
896 This method uses Emacs's builtin networking directly. This is the | |
897 default. It can be used only if there is no firewall blocking access. | |
898 @end table | |
899 @end defvar | |
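
Binding the variable temporarily might look like the following sketch.
The function name @code{my-open-direct-stream} and the stream name are
made up for illustration; only @code{url-open-stream} and
@code{url-gateway-method} come from the library.

@example
(require 'url-gw)

(defun my-open-direct-stream (host port)
  "Open a stream to HOST on PORT using the native gateway method."
  ;; The binding is in effect only for the duration of the call.
  (let ((url-gateway-method 'native))
    (url-open-stream "my-direct" nil host port)))
@end example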
900 | |
901 The following variables control the gateway methods. | |
902 | |
903 @defopt url-gateway-telnet-host | |
904 The gateway host to telnet to. Once logged in there, you then telnet | |
905 out to the hosts you want to connect to. | |
906 @end defopt | |
907 @defopt url-gateway-telnet-parameters | |
908 This should be a list of parameters to pass to the @command{telnet} program. | |
909 @end defopt | |
910 @defopt url-gateway-telnet-password-prompt | |
911 This is a regular expression that matches the password prompt when | |
912 logging in. | |
913 @end defopt | |
914 @defopt url-gateway-telnet-login-prompt | |
915 This is a regular expression that matches the username prompt when | |
916 logging in. | |
917 @end defopt | |
918 @defopt url-gateway-telnet-user-name | |
919 The username to log in with. | |
920 @end defopt | |
921 @defopt url-gateway-telnet-password | |
922 The password to send when logging in. | |
923 @end defopt | |
924 @defopt url-gateway-prompt-pattern | |
925 This is a regular expression that matches the shell prompt. | |
926 @end defopt | |
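
A telnet gateway setup could be sketched as follows.  The host name,
user name and regular expressions are placeholders chosen for
illustration and need adjusting to match the prompts of the real
gateway host.

@example
(setq url-gateway-method 'telnet
      ;; Hypothetical gateway host and account.
      url-gateway-telnet-host "gateway.example.com"
      url-gateway-telnet-user-name "jrandom"
      ;; Illustrative prompt patterns only.
      url-gateway-telnet-login-prompt "login: *$"
      url-gateway-telnet-password-prompt "password: *$"
      url-gateway-prompt-pattern "[#$>] *$")
@end example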
927 | |
928 @defopt url-gateway-rlogin-host | |
929 Host to @samp{rlogin} to before telnetting out. | |
930 @end defopt | |
931 @defopt url-gateway-rlogin-parameters | |
932 Parameters to pass to @samp{rsh}. | |
933 @end defopt | |
934 @defopt url-gateway-rlogin-user-name | |
935 User name to use when logging in to the gateway. | |
936 @end defopt | |
937 @defopt url-gateway-prompt-pattern | |
938 This is a regular expression that matches the shell prompt. | |
939 @end defopt | |
940 | |
941 @defopt socks-server | |
942 This specifies the default server; it takes the form | |
943 @w{@code{("Default server" @var{server} @var{port} @var{version})}} | |
944 where @var{version} can be either 4 or 5. | |
945 @end defopt | |
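
For instance, a @sc{socks} v5 setup might be sketched like this,
assuming a server at the placeholder address
@samp{socks.example.com} listening on port 1080:

@example
(setq url-gateway-method 'socks
      ;; ("Default server" SERVER PORT VERSION), as described above.
      socks-server '("Default server" "socks.example.com" 1080 5))
@end example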
946 @defvar socks-password | |
947 If this is @code{nil} then you will be asked for the password, | |
948 otherwise it will be used as the password for authenticating you to | |
949 the @sc{socks} server. | |
950 @end defvar | |
951 @defvar socks-username | |
952 This is the username to use when authenticating yourself to the | |
953 @sc{socks} server. By default this is your login name. | |
954 @end defvar | |
955 @defvar socks-timeout | |
956 This controls how long, in seconds, to wait for responses from the | |
957 @sc{socks} server; it is 5 by default. | |
958 @end defvar | |
959 @c fixme: these have been effectively commented-out in the code | |
960 @c @defopt socks-server-aliases | |
961 @c This is a list of server aliases. It is a list of aliases of the form | |
962 @c @var{(alias hostname port version)}. | |
963 @c @end defopt | |
964 @c @defopt socks-network-aliases | |
965 @c This is a list of network aliases. Each entry in the list takes the form | |
966 @c @var{(alias (network))} where @var{alias} is a string that names the | |
967 @c @var{network}. The networks can contain a pair (not a dotted pair) of | |
968 @c @sc{ip} addresses which specify a range of @sc{ip} addresses, an @sc{ip} | |
969 @c address and a netmask, a domain name or a unique hostname or @sc{ip} | |
970 @c address. | |
971 @c @end defopt | |
972 @c @defopt socks-redirection-rules | |
973 @c This is a list of redirection rules. Each rule takes the form | |
974 @c @var{(Destination network Connection type)} where @var{Destination | |
975 @c network} is a network alias from @code{socks-network-aliases} and | |
976 @c @var{Connection type} can be @code{nil} in which case a direct | |
977 @c connection is used, or it can be an alias from | |
978 @c @code{socks-server-aliases} in which case that server is used as a | |
979 @c proxy. | |
980 @c @end defopt | |
981 @defopt socks-nslookup-program | |
982 @cindex @command{nslookup} | |
983 This is the @samp{nslookup} program. It is @code{"nslookup"} by default. | |
984 @end defopt | |
985 | |
986 @menu | |
987 * Suppressing network connexions:: | |
988 @end menu | |
989 @c * Broken hostname resolution:: | |
990 | |
991 @node Suppressing network connexions | |
992 @subsection Suppressing Network Connexions | |
993 | |
994 @cindex network connexions, suppressing | |
995 @cindex suppressing network connexions | |
996 @cindex bugs, HTML | |
997 @cindex HTML `bugs' | |
998 In some circumstances it is desirable to suppress making network | |
999 connexions. A typical case is when rendering HTML in a mail user | |
1000 agent, when external URLs should not be activated, particularly to | |
1001 avoid `bugs' which `call home' by fetching single-pixel images and the | |
1002 like. To arrange this, bind the following variable for the duration | |
1003 of such processing. | |
1004 | |
1005 @defvar url-gateway-unplugged | |
1006 If this variable is non-@code{nil}, new network connexions are never | |
1007 opened by the URL library. | |
1008 @end defvar | |
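
A minimal sketch of such a binding follows.  The wrapper
@code{my-call-without-network} is a hypothetical helper, not part of
the library; it simply calls a function with the variable bound.

@example
(require 'url-gw)

(defun my-call-without-network (function &rest args)
  "Call FUNCTION with ARGS while the URL library is unplugged."
  ;; No new connexions are opened while this binding is in effect.
  (let ((url-gateway-unplugged t))
    (apply function args)))
@end example

A mail user agent could then pass its HTML rendering function to this
helper, so that rendering a message never fetches external URLs.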
1009 | |
1010 @c @node Broken hostname resolution | |
1011 @c @subsection Broken Hostname Resolution | |
1012 | |
1013 @c @cindex hostname resolver | |
1014 @c @cindex resolver, hostname | |
1015 @c Some C libraries do not include the hostname resolver routines in | |
1016 @c their static libraries. If Emacs was linked statically, and was not | |
1017 @c linked with the resolver libraries, it will not be able to get to any | |
1018 @c machines off the local network. This is characterized by being able | |
1019 @c to reach someplace with a raw ip number, but not its hostname | |
1020 @c (@url{http://129.79.254.191/} works, but | |
1021 @c @url{http://www.cs.indiana.edu/} doesn't). This used to happen on | |
1022 @c SunOS4 and Ultrix, but is probably now rare. If Emacs can't be | |
1023 @c rebuilt linked against the resolver library, it can use the external | |
1024 @c @command{nslookup} program instead. | |
1025 | |
1026 @c @defopt url-gateway-broken-resolution | |
1027 @c @cindex @code{nslookup} program | |
1028 @c @cindex program, @code{nslookup} | |
1029 @c If non-@code{nil}, this variable says to use the program specified by | |
1030 @c @code{url-gateway-nslookup-program} program to do hostname resolution. | |
1031 @c @end defopt | |
1032 | |
1033 @c @defopt url-gateway-nslookup-program | |
1034 @c The name of the program to do hostname lookup if Emacs can't do it | |
1035 @c directly. This program should expect a single argument on the command | |
1036 @c line---the hostname to resolve---and should produce output similar to | |
1037 @c the standard Unix @command{nslookup} program: | |
1038 @c @example | |
1039 @c Name: www.cs.indiana.edu | |
1040 @c Address: 129.79.254.191 | |
1041 @c @end example | |
1042 @c @end defopt | |
1043 | |
1044 @node History | |
1045 @section History | |
1046 | |
1047 The library can maintain a global history list tracking URLs accessed. | |
1048 URL completion can be done from it. The history mechanism is set up | |
1049 @findex url-do-setup | |
1050 automatically via @code{url-do-setup} when it is configured to be on. | |
1051 Note that the size of the history list is currently not limited. | |
1052 | |
1053 @vindex url-history-hash-table | |
1054 The history `list' is actually a hash table, | |
1055 @code{url-history-hash-table}. It contains access times keyed by URL | |
1056 strings. The times are in the format returned by @code{current-time}. | |
1057 | |
1058 @defun url-history-update-url url time | |
1059 This function updates the history table with an entry for @var{url} | |
1060 accessed at the given @var{time}. | |
1061 @end defun | |
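
As an illustration, the following sketch records an access and reads
it back from the table.  It assumes the history table has already been
initialised by the time it runs; the URL is only an example.

@example
(require 'url-history)

(let ((url "http://www.gnu.org/"))
  ;; Record an access to URL at the current time.
  (url-history-update-url url (current-time))
  ;; Look the entry up again and format the recorded access time.
  (format-time-string "%Y-%m-%d %H:%M:%S"
                      (gethash url url-history-hash-table)))
@end example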
1062 | |
1063 @defopt url-history-track | |
1064 If non-@code{nil}, the library will keep track of all the URLs | |
1065 accessed. If it is @code{t}, the list is saved to disk at the end of | |
1066 each Emacs session. The default is @code{nil}. | |
1067 @end defopt | |
1068 | |
1069 @defopt url-history-file | |
1070 The file storing the history list between sessions. It defaults to | |
1071 @file{history} in @code{url-configuration-directory}. | |
1072 @end defopt | |
1073 | |
1074 @defopt url-history-save-interval | |
1075 @findex url-history-setup-save-timer | |
1076 The number of seconds between automatic saves of the history list. | |
1077 Default is one hour. Note that if you change this variable directly, | |
1078 rather than using Custom, after @code{url-do-setup} has been run, you | |
1079 need to run the function @code{url-history-setup-save-timer}. | |
1080 @end defopt | |
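
Changing the interval outside Custom might look like this sketch; the
value of ten minutes is arbitrary.

@example
(require 'url-history)

;; Save the history every 600 seconds instead of every hour.
(setq url-history-save-interval 600)
;; Restart the timer so that the new interval takes effect.
(url-history-setup-save-timer)
@end example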
1081 | |
1082 @defun url-history-parse-history &optional fname | |
1083 Parses the history file @var{fname} (default @code{url-history-file}) | |
1084 and sets up the history list. | |
1085 @end defun | |
1086 | |
1087 @defun url-history-save-history &optional fname | |
1088 Saves the current history to file @var{fname} (default | |
1089 @code{url-history-file}). | |
1090 @end defun | |
1091 | |
1092 @defun url-completion-function string predicate function | |
1093 You can use this function to do completion of URLs from the history. | |
1094 @end defun | |
1095 | |
1096 @node Customization | |
1097 @chapter Customization | |
1098 | |
1099 @section Environment Variables | |
1100 | |
1101 @cindex environment variables | |
1102 The following environment variables affect the library's operation at | |
1103 startup. | |
1104 | |
1105 @table @code | |
1106 @item TMPDIR | |
1107 @vindex TMPDIR | |
1108 @vindex url-temporary-directory | |
1109 If this is defined, @code{url-temporary-directory} is initialized from | |
1110 it. | |
1111 @end table | |
1112 | |
1113 @section General User Options | |
1114 | |
1115 The following user options, settable with Customize, affect the | |
1116 general operation of the package. | |
1117 | |
1118 @defopt url-debug | |
1119 @cindex debugging | |
1120 Specifies the types of debug messages which the library logs to | |
1121 the @code{*URL-DEBUG*} buffer. | |
1122 @code{t} means log all messages. | |
1123 A number means log all messages and show them with @code{message}. | |
1124 It may also be a list of the types of messages to be logged. | |
1125 @end defopt | |
1126 @defopt url-personal-mail-address | |
1127 @end defopt | |
1128 @defopt url-privacy-level | |
1129 @end defopt | |
1130 @defopt url-uncompressor-alist | |
1131 @end defopt | |
1132 @defopt url-passwd-entry-func | |
1133 @end defopt | |
1134 @defopt url-standalone-mode | |
1135 @end defopt | |
1136 @defopt url-bad-port-list | |
1137 @end defopt | |
1138 @defopt url-max-password-attempts | |
1139 @end defopt | |
1140 @defopt url-temporary-directory | |
1141 @end defopt | |
1142 @defopt url-show-status | |
1143 @end defopt | |
1144 @defopt url-confirmation-func | |
1145 The function to use for asking yes or no questions. This is normally | |
1146 either @code{y-or-n-p} or @code{yes-or-no-p}, but could be another | |
1147 function taking a single argument (the prompt) and returning @code{t} | |
1148 only if an affirmative answer is given. | |
1149 @end defopt | |
1150 @defopt url-gateway-method | |
1151 @c fixme: describe gatewaying | |
1152 A symbol specifying the type of gateway support to use for connexions | |
1153 from the local machine. The supported methods are: | |
1154 | |
1155 @table @code | |
1156 @item telnet | |
1157 Run telnet in a subprocess to connect; | |
1158 @item rlogin | |
1159 Rlogin to another machine to connect; | |
1160 @item socks | |
1161 Connect through a socks server; | |
1162 @item ssl | |
1163 Connect with SSL; | |
1164 @item native | |
1165 Connect directly. | |
1166 @end table | |
1167 @end defopt | |
1168 | |
1169 @node Function Index | |
1170 @unnumbered Command and Function Index | |
1171 @printindex fn | |
1172 | |
1173 @node Variable Index | |
1174 @unnumbered Variable Index | |
1175 @printindex vr | |
1176 | |
1177 @node Concept Index | |
1178 @unnumbered Concept Index | |
1179 @printindex cp | |
1180 | |
1181 @setchapternewpage odd | |
1182 @contents | |
1183 @bye |