annotate doc/misc/url.texi @ 93217:e9fe0099fac2
(Contributors): Update my email.
author | Stefan Monnier <monnier@iro.umontreal.ca> |
---|---|
date | Wed, 26 Mar 2008 02:39:11 +0000 |
parents | 5d58981e6690 |
children | ef5e07e42359 |
rev | line source |
---|---|
84321 | 1 \input texinfo |
2 @setfilename ../../info/url |
84321 | 3 @settitle URL Programmer's Manual |
4 | |
5 @iftex | |
6 @c @finalout | |
7 @end iftex | |
8 @c @setchapternewpage odd | |
9 @c @smallbook | |
10 | |
11 @tex | |
12 \overfullrule=0pt | |
13 %\global\baselineskip 30pt % for printing in double space | |
14 @end tex | |
15 @dircategory World Wide Web | |
16 @dircategory GNU Emacs Lisp | |
17 @direntry | |
18 * URL: (url). URL loading package. | |
19 @end direntry | |
20 | |
21 @ifnottex | |
22 This file documents the URL loading package. | |
23 | |
24 Copyright @copyright{} 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2002, | |
87903 | 25 2004, 2005, 2006, 2007, 2008 Free Software Foundation, Inc. |
84321 | 26 |
27 Permission is granted to copy, distribute and/or modify this document | |
28 under the terms of the GNU Free Documentation License, Version 1.2 or | |
29 any later version published by the Free Software Foundation; with the | |
30 Invariant Sections being | |
31 ``GNU GENERAL PUBLIC LICENSE''. A copy of the | |
32 license is included in the section entitled ``GNU Free Documentation | |
33 License.'' | |
34 @end ifnottex | |
35 | |
36 @c | |
37 @titlepage | |
38 @sp 6 | |
39 @center @titlefont{URL} | |
40 @center @titlefont{Programmer's Manual} | |
41 @sp 4 | |
42 @center First Edition, URL Version 2.0 | |
43 @sp 1 | |
44 @c @center December 1999 | |
45 @sp 5 | |
46 @center William M. Perry | |
47 @center @email{wmperry@@gnu.org} | |
48 @center David Love | |
49 @center @email{fx@@gnu.org} | |
50 @page | |
51 @vskip 0pt plus 1filll | |
52 Copyright @copyright{} 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2002, | |
87903 | 53 2003, 2004, 2005, 2006, 2007, 2008 Free Software Foundation, Inc. |
84321 | 54 |
55 Permission is granted to copy, distribute and/or modify this document | |
56 under the terms of the GNU Free Documentation License, Version 1.2 or | |
57 any later version published by the Free Software Foundation; with the | |
58 Invariant Sections being | |
59 ``GNU GENERAL PUBLIC LICENSE''. A copy of the | |
60 license is included in the section entitled ``GNU Free Documentation | |
61 License.'' | |
62 @end titlepage | |
63 @page | |
64 @node Top | |
65 @top URL | |
66 | |
67 | |
68 | |
69 @menu | |
70 * Getting Started:: Preparing your program to use URLs. | |
71 * Retrieving URLs:: How to use this package to retrieve a URL. | |
72 * Supported URL Types:: Descriptions of URL types currently supported. | |
73 * Defining New URLs:: How to define a URL loader for a new protocol. | |
74 * General Facilities:: URLs can be cached, accessed via a gateway | |
75 and tracked in a history list. | |
76 * Customization:: Variables you can alter. | |
77 * GNU Free Documentation License:: The license for this documentation. | |
78 * Function Index:: | |
79 * Variable Index:: | |
80 * Concept Index:: | |
81 @end menu | |
82 | |
83 @node Getting Started | |
84 @chapter Getting Started | |
85 @cindex URLs, definition | |
86 @cindex URIs | |
87 | |
@dfn{Uniform Resource Locators} (URLs) are a specific form of
@dfn{Uniform Resource Identifiers} (URIs) described in RFC 2396, which
updates RFC 1738 and RFC 1808. RFC 2016 defines uniform resource
agents.
92 | |
93 URIs have the form @var{scheme}:@var{scheme-specific-part}, where the | |
94 @var{scheme}s supported by this library are described below. | |
95 @xref{Supported URL Types}. | |
96 | |
97 FTP, NFS, HTTP, HTTPS, @code{rlogin}, @code{telnet}, tn3270, | |
98 IRC and gopher URLs all have the form | |
99 | |
100 @example | |
101 @var{scheme}://@r{[}@var{userinfo}@@@r{]}@var{hostname}@r{[}:@var{port}@r{]}@r{[}/@var{path}@r{]} | |
102 @end example | |
103 @noindent | |
104 where @samp{@r{[}} and @samp{@r{]}} delimit optional parts. | |
105 @var{userinfo} sometimes takes the form @var{username}:@var{password} | |
106 but you should beware of the security risks of sending cleartext | |
107 passwords. @var{hostname} may be a domain name or a dotted decimal | |
address. If the @samp{:@var{port}} is omitted then the library will
use the `well known' port for that service when accessing URLs. With
the possible exception of @code{telnet}, it is rare for ports to be
specified, and using a non-standard port may have undesired
consequences if a different service is listening on that port (e.g.,
an HTTP URL specifying the SMTP port can cause mail to be
sent). @c , but @xref{Other Variables, url-bad-port-list}.
115 The meaning of the @var{path} component depends on the service. | |
116 | |
117 @menu | |
118 * Configuration:: | |
119 * Parsed URLs:: URLs are parsed into vector structures. | |
120 @end menu | |
121 | |
122 @node Configuration | |
123 @section Configuration | |
124 | |
125 @defvar url-configuration-directory | |
126 @cindex @file{~/.url} | |
127 @cindex configuration files | |
128 The directory in which URL configuration files, the cache etc., | |
129 reside. Default @file{~/.url}. | |
130 @end defvar | |
131 | |
132 @node Parsed URLs | |
133 @section Parsed URLs | |
134 @cindex parsed URLs | |
135 The library functions typically operate on @dfn{parsed} versions of | |
136 URLs. These are actually vectors of the form: | |
137 | |
138 @example | |
139 [@var{type} @var{user} @var{password} @var{host} @var{port} @var{file} @var{target} @var{attributes} @var{full}] | |
140 @end example | |
141 | |
142 @noindent where | |
143 @table @var | |
144 @item type | |
is the type of the URL scheme, e.g., @code{http};
146 @item user | |
147 is the username associated with it, or @code{nil}; | |
148 @item password | |
149 is the user password associated with it, or @code{nil}; | |
150 @item host | |
151 is the host name associated with it, or @code{nil}; | |
152 @item port | |
153 is the port number associated with it, or @code{nil}; | |
154 @item file | |
155 is the `file' part of it, or @code{nil}. This doesn't necessarily | |
156 actually refer to a file; | |
157 @item target | |
158 is the target part, or @code{nil}; | |
159 @item attributes | |
160 is the attributes associated with it, or @code{nil}; | |
161 @item full | |
162 is @code{t} for a fully-specified URL, with a host part indicated by | |
163 @samp{//} after the scheme part. | |
164 @end table | |
165 | |
166 @findex url-type | |
167 @findex url-user | |
168 @findex url-password | |
169 @findex url-host | |
170 @findex url-port | |
171 @findex url-file | |
172 @findex url-target | |
173 @findex url-attributes | |
174 @findex url-full | |
175 @findex url-set-type | |
176 @findex url-set-user | |
177 @findex url-set-password | |
178 @findex url-set-host | |
179 @findex url-set-port | |
180 @findex url-set-file | |
181 @findex url-set-target | |
182 @findex url-set-attributes | |
183 @findex url-set-full | |
184 These attributes have accessors named @code{url-@var{part}}, where | |
185 @var{part} is the name of one of the elements above, e.g., | |
186 @code{url-host}. Similarly, there are setters of the form | |
187 @code{url-set-@var{part}}. | |
188 | |
189 There are functions for parsing and unparsing between the string and | |
190 vector forms. | |
191 | |
192 @defun url-generic-parse-url url | |
193 Return a parsed version of the string @var{url}. | |
194 @end defun | |
195 | |
196 @defun url-recreate-url url | |
197 @cindex unparsing URLs | |
198 Recreates a URL string from the parsed @var{url}. | |
199 @end defun | |
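
The following sketch shows the parse and unparse functions used
together. The helper name @code{my-url-type-and-host} is hypothetical
and exists only for illustration; the accessors are the ones listed
above.

@example
(require 'url-parse)

(defun my-url-type-and-host (string)
  "Return a list of the scheme and host name found in STRING."
  (let ((parsed (url-generic-parse-url string)))
    (list (url-type parsed) (url-host parsed))))

(my-url-type-and-host "http://www.example.com/foo/bar")
     @result{} ("http" "www.example.com")

;; Parsing and then unparsing should give back an equivalent string.
(url-recreate-url
 (url-generic-parse-url "http://www.example.com/foo/bar"))
@end example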
200 | |
201 @node Retrieving URLs | |
202 @chapter Retrieving URLs | |
203 | |
204 @defun url-retrieve-synchronously url | |
205 Retrieve @var{url} synchronously and return a buffer containing the | |
206 data. @var{url} is either a string or a parsed URL structure. Return | |
207 @code{nil} if there are no data associated with it (the case for dired, | |
208 info, or mailto URLs that need no further processing). | |
209 @end defun | |
210 | |
211 @defun url-retrieve url callback &optional cbargs | |
212 Retrieve @var{url} asynchronously and call @var{callback} with args | |
213 @var{cbargs} when finished. The callback is called when the object | |
214 has been completely retrieved, with the current buffer containing the | |
215 object and any MIME headers associated with it. @var{url} is either a | |
216 string or a parsed URL structure. Returns the buffer @var{url} will | |
217 load into, or @code{nil} if the process has already completed. | |
218 @end defun | |
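
As a minimal sketch of both retrieval styles, assuming a reachable
server at the placeholder address @samp{http://www.example.com/}: the
callback name @code{my-url-show-size} is hypothetical, and it is
written to accept any arguments because the exact arguments passed to
the callback may differ between versions of the library.

@example
(require 'url)

(defun my-url-show-size (&rest _args)
  "Hypothetical callback: report the size of the retrieved buffer."
  ;; The current buffer holds the headers and the retrieved object.
  (message "Retrieved %d characters" (buffer-size)))

;; Asynchronous: returns at once; the callback runs when done.
(url-retrieve "http://www.example.com/" 'my-url-show-size)

;; Synchronous: blocks until the data (or nil) is available.
(let ((buffer (url-retrieve-synchronously "http://www.example.com/")))
  (when buffer
    (with-current-buffer buffer
      (buffer-substring (point-min)
                        (min (point-max) (+ (point-min) 200))))))
@end example

The synchronous form is simpler to use from a program, but it blocks
Emacs while the transfer is in progress.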
219 | |
220 @node Supported URL Types | |
221 @chapter Supported URL Types | |
222 | |
223 @menu | |
224 * http/https:: Hypertext Transfer Protocol. | |
225 * file/ftp:: Local files and FTP archives. | |
226 * info:: Emacs `Info' pages. | |
227 * mailto:: Sending email. | |
228 * news/nntp/snews:: Usenet news. | |
229 * rlogin/telnet/tn3270:: Remote host connectivity. | |
230 * irc:: Internet Relay Chat. | |
231 * data:: Embedded data URLs. | |
232 * nfs:: Networked File System | |
233 @c * finger:: | |
234 @c * gopher:: | |
235 @c * netrek:: | |
236 @c * prospero:: | |
237 * cid:: Content-ID. | |
238 * about:: | |
239 * ldap:: Lightweight Directory Access Protocol | |
240 * imap:: IMAP mailboxes. | |
241 * man:: Unix man pages. | |
242 @end menu | |
243 | |
244 @node http/https | |
245 @section @code{http} and @code{https} | |
246 | |
The scheme @code{http} is Hypertext Transfer Protocol. The library
supports version 1.1, specified in RFC 2616. (This supersedes 1.0,
defined in RFC 1945.) HTTP URLs have the following form, where most of
the parts are optional:
251 @example | |
252 http://@var{user}:@var{password}@@@var{host}:@var{port}/@var{path}?@var{searchpart}#@var{fragment} | |
253 @end example | |
254 @c The @code{:@var{port}} part is optional, and @var{port} defaults to | |
255 @c 80. The @code{/@var{path}} part, if present, is a slash-separated | |
256 @c series elements. The @code{?@var{searchpart}}, if present, is the | |
257 @c query for a search or the content of a form submission. The | |
258 @c @code{#fragment} part, if present, is a location in the document. | |
259 | |
The scheme @code{https} is a secure version of @code{http}, with
transmission via SSL. It is defined in RFC 2818. Its default port is
443. This scheme depends on SSL support in Emacs via the
@file{ssl.el} library and is actually implemented by forcing the
@code{ssl} gateway method to be used. @xref{Gateways in general}.
265 | |
266 @defopt url-honor-refresh-requests | |
This controls honouring of HTTP @samp{Refresh} headers by which
servers can direct clients to reload documents from the same URL or a
different one. @code{nil} means they will not be honoured,
@code{t} (the default) means they will always be honoured, and
any other value means the user will be asked on each request.
272 @end defopt | |
273 | |
274 | |
275 @menu | |
276 * Cookies:: | |
277 * HTTP language/coding:: | |
278 * HTTP URL Options:: | |
279 * Dealing with HTTP documents:: | |
280 @end menu | |
281 | |
282 @node Cookies | |
283 @subsection Cookies | |
284 | |
285 @defopt url-cookie-file | |
286 The file in which cookies are stored, defaulting to @file{cookies} in | |
287 the directory specified by @code{url-configuration-directory}. | |
288 @end defopt | |
289 | |
290 @defopt url-cookie-confirmation | |
Specifies whether confirmation is required to accept cookies.
292 @end defopt | |
293 | |
294 @defopt url-cookie-multiple-line | |
295 Specifies whether to put all cookies for the server on one line in the | |
296 HTTP request to satisfy broken servers like | |
297 @url{http://www.hotmail.com}. | |
298 @end defopt | |
299 | |
300 @defopt url-cookie-trusted-urls | |
301 A list of regular expressions matching URLs from which to accept | |
302 cookies always. | |
303 @end defopt | |
304 | |
305 @defopt url-cookie-untrusted-urls | |
306 A list of regular expressions matching URLs from which to reject | |
307 cookies always. | |
308 @end defopt | |
309 | |
310 @defopt url-cookie-save-interval | |
311 The number of seconds between automatic saves of cookies to disk. | |
312 Default is one hour. | |
313 @end defopt | |
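
The cookie options above might be combined as in the following sketch.
The host names and regular expressions are placeholders chosen for
illustration, not recommended values.

@example
;; Ask before accepting cookies, except for the listed sites.
(setq url-cookie-confirmation t
      ;; Always accept cookies from this (hypothetical) site.
      url-cookie-trusted-urls '("^https?://www\\.example\\.com/")
      ;; Always reject cookies from this (hypothetical) domain.
      url-cookie-untrusted-urls '("\\.ads\\.example\\.net/")
      ;; Save cookies to disk every ten minutes.
      url-cookie-save-interval 600)
@end example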
314 | |
315 | |
316 @node HTTP language/coding | |
317 @subsection Language and Encoding Preferences | |
318 | |
319 HTTP allows clients to express preferences for the language and | |
320 encoding of documents which servers may honour. For each of these | |
321 variables, the value is a string; it can specify a single choice, or | |
322 it can be a comma-separated list. | |
323 | |
Normally this list is ordered by descending preference. However, each
325 element can be followed by @samp{;q=@var{priority}} to specify its | |
326 preference level, a decimal number from 0 to 1; e.g., for | |
327 @code{url-mime-language-string}, @w{@code{"de, en-gb;q=0.8, | |
328 en;q=0.7"}}. An element that has no @samp{;q} specification has | |
329 preference level 1. | |
330 | |
331 @defopt url-mime-charset-string | |
332 @cindex character sets | |
333 @cindex coding systems | |
334 This variable specifies a preference for character sets when documents | |
335 can be served in more than one encoding. | |
336 | |
337 HTTP allows specifying a series of MIME charsets which indicate your | |
338 preferred character set encodings, e.g., Latin-9 or Big5, and these | |
339 can be weighted. The default series is generated automatically from | |
340 the associated MIME types of all defined coding systems, sorted by the | |
341 coding system priority specified in Emacs. @xref{Recognize Coding, , | |
342 Recognizing Coding Systems, emacs, The GNU Emacs Manual}. | |
343 @end defopt | |
344 | |
345 @defopt url-mime-language-string | |
346 @cindex language preferences | |
347 A string specifying the preferred language when servers can serve | |
348 files in several languages. Use RFC 1766 abbreviations, e.g., | |
349 @samp{en} for English, @samp{de} for German. | |
350 | |
351 The string can be @code{"*"} to get the first available language (as | |
352 opposed to the default). | |
353 @end defopt | |
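
For instance, the preference list used as an illustration earlier in
this section could be installed as follows. The charset value is an
assumed example of the same syntax, not the library's default.

@example
;; Prefer German, then British English, then any English.
(setq url-mime-language-string "de, en-gb;q=0.8, en;q=0.7")

;; Prefer UTF-8, but accept Latin-1 with lower priority.
(setq url-mime-charset-string "utf-8, iso-8859-1;q=0.5")
@end example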
354 | |
355 @node HTTP URL Options | |
356 @subsection HTTP URL Options | |
357 | |
358 HTTP supports an @samp{OPTIONS} method describing things supported by | |
359 the URL@. | |
360 | |
361 @defun url-http-options url | |
Returns a property list describing options available for @var{url}. The
property list members are:
364 | |
365 @table @code | |
366 @item methods | |
367 A list of symbols specifying what HTTP methods the resource | |
368 supports. | |
369 | |
370 @item dav | |
371 @cindex DAV | |
372 A list of numbers specifying what DAV protocol/schema versions are | |
373 supported. | |
374 | |
375 @item dasl | |
376 @cindex DASL | |
A list of supported DASL search types (in string form).
378 | |
379 @item ranges | |
380 A list of the units available for use in partial document fetches. | |
381 | |
382 @item p3p | |
383 @cindex P3P | |
The @dfn{Platform for Privacy Preferences} description for the resource.
385 Currently this is just the raw header contents. | |
386 @end table | |
387 | |
388 @end defun | |
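
Since the return value is an ordinary property list, individual
members can be extracted with @code{plist-get}. The sketch below
assumes that the function is provided by the @file{url-http} library
and that the placeholder server answers @samp{OPTIONS} requests.

@example
(require 'url-http)

(let ((options (url-http-options "http://www.example.com/")))
  ;; The list of HTTP methods the resource claims to support.
  (plist-get options 'methods))
@end example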
389 | |
390 @node Dealing with HTTP documents | |
391 @subsection Dealing with HTTP documents | |
392 | |
393 HTTP URLs are retrieved into a buffer containing the HTTP headers | |
394 followed by the body. Since the headers are quasi-MIME, they may be | |
395 processed using the MIME library. @xref{Top,, Emacs MIME, | |
396 emacs-mime, The Emacs MIME Manual}. The URL package provides a | |
397 function to do this in general: | |
398 | |
399 @defun url-decode-text-part handle &optional coding | |
400 This function decodes charset-encoded text in the current buffer. In | |
401 Emacs, the buffer is expected to be unibyte initially and is set to | |
402 multibyte after decoding. | |
@var{handle} is the MIME handle of the original part. @var{coding} is
an explicit coding to use, overriding what the MIME headers specify.
405 The coding system used for the decoding is returned. | |
406 | |
407 Note that this function doesn't deal with @samp{http-equiv} charset | |
408 specifications in HTML @samp{<meta>} elements. | |
409 @end defun | |
410 | |
411 @node file/ftp | |
412 @section file and ftp | |
413 @cindex files | |
414 @cindex FTP | |
415 @cindex File Transfer Protocol | |
416 @cindex compressed files | |
417 @cindex dired | |
418 | |
419 @example | |
420 ftp://@var{user}:@var{password}@@@var{host}:@var{port}/@var{file} | |
421 file://@var{user}:@var{password}@@@var{host}:@var{port}/@var{file} | |
422 @end example | |
423 | |
These schemes are defined in RFC 1738.
425 @samp{ftp:} and @samp{file:} are synonymous in this library. They | |
426 allow reading arbitrary files from hosts. Either @samp{ange-ftp} | |
427 (Emacs) or @samp{efs} (XEmacs) is used to retrieve them from remote | |
428 hosts. Local files are accessed directly. | |
429 | |
Compressed files are handled, but support is hard-coded so that
@code{jka-compr-compression-info-list} and so on have no effect.
432 Suffixes recognized are @samp{.z}, @samp{.gz}, @samp{.Z} and | |
433 @samp{.bz2}. | |
434 | |
435 @defopt url-directory-index-file | |
436 The filename to look for when indexing a directory, default | |
437 @samp{"index.html"}. If this file exists, and is readable, then it | |
438 will be viewed instead of using @code{dired} to view the directory. | |
439 @end defopt | |
440 | |
441 @node info | |
442 @section info | |
443 @cindex Info | |
444 @cindex Texinfo | |
445 @findex Info-goto-node | |
446 | |
447 @example | |
448 info:@var{file}#@var{node} | |
449 @end example | |
450 | |
451 Info URLs are not officially defined. They invoke | |
452 @code{Info-goto-node} with argument @samp{(@var{file})@var{node}}. | |
453 @samp{#@var{node}} is optional, defaulting to @samp{Top}. | |
454 | |
455 @node mailto | |
456 @section mailto | |
457 | |
458 @cindex mailto | |
459 @cindex email | |
A mailto URL will send an email message to the address in the
URL; for example, @samp{mailto:foo@@bar.com} would compose a
message to @samp{foo@@bar.com}.
463 | |
464 @defopt url-mail-command | |
465 @vindex mail-user-agent | |
The function called whenever url needs to send mail. This should
normally be left to default from @code{mail-user-agent}. @xref{Mail
Methods, , Mail-Composition Methods, emacs, The GNU Emacs Manual}.
469 @end defopt | |
470 | |
471 An @samp{X-Url-From} header field containing the URL of the document | |
472 that contained the mailto URL is added if that URL is known. | |
473 | |
474 RFC 2368 extends the definition of mailto URLs in RFC 1738. | |
475 The form of a mailto URL is | |
476 @example | |
477 @samp{mailto:@var{mailbox}[?@var{header}=@var{contents}[&@var{header}=@var{contents}]]} | |
478 @end example | |
@noindent where an arbitrary number of @var{header}s can be added. If the
@var{header} is @samp{body}, then @var{contents} is put in the body;
otherwise a @var{header} header field is created with @var{contents}
as its contents. Note that the URL library does not consider any
headers `dangerous', so you should check them before sending the
message.
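
A mailto URL with header fields can be assembled as in the following
sketch. The helper @code{my-mailto-url} is hypothetical; it relies on
@code{url-hexify-string} from @file{url-util} to escape characters
such as spaces in the field contents.

@example
(require 'url-util)

(defun my-mailto-url (mailbox subject body)
  "Return a mailto URL for MAILBOX with SUBJECT and BODY fields."
  (concat "mailto:" mailbox
          "?subject=" (url-hexify-string subject)
          "&body=" (url-hexify-string body)))

(my-mailto-url "foo@@bar.com" "Hello world" "First line")
     @result{} "mailto:foo@@bar.com?subject=Hello%20world&body=First%20line"
@end example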
485 | |
486 @c Fixme: update | |
Email messages are defined in RFC 822.
488 | |
489 @node news/nntp/snews | |
490 @section @code{news}, @code{nntp} and @code{snews} | |
491 @cindex news | |
492 @cindex network news | |
493 @cindex usenet | |
494 @cindex NNTP | |
495 @cindex snews | |
496 | |
497 @c draft-gilman-news-url-01 | |
The network news URL schemes take the following forms, following RFC
1738, except that for compatibility with other clients, host and port
fields may be included in news URLs though they are properly only
allowed for nntp and snews.
502 | |
503 @table @samp | |
504 @item news:@var{newsgroup} | |
505 Retrieves a list of messages in @var{newsgroup}; | |
506 @item news:@var{message-id} | |
507 Retrieves the message with the given @var{message-id}; | |
508 @item news:* | |
509 Retrieves a list of all available newsgroups; | |
510 @item nntp://@var{host}:@var{port}/@var{newsgroup} | |
511 @itemx nntp://@var{host}:@var{port}/@var{message-id} | |
512 @itemx nntp://@var{host}:@var{port}/* | |
513 Similar to the @samp{news} versions. | |
514 @end table | |
515 | |
516 @samp{:@var{port}} is optional and defaults to :119. | |
517 | |
518 @samp{snews} is the same as @samp{nntp} except that the default port | |
519 is :563. | |
520 @cindex SSL | |
521 (It is tunneled through SSL.) | |
522 | |
523 An @samp{nntp} URL is the same as a news URL, except that the URL may | |
524 specify an article by its number. | |
525 | |
526 @defopt url-news-server | |
527 This variable can be used to override the default news server. | |
528 Usually this will be set by the Gnus package, which is used to fetch | |
529 news. | |
530 @cindex environment variable | |
531 @vindex NNTPSERVER | |
532 It may be set from the conventional environment variable | |
533 @code{NNTPSERVER}. | |
534 @end defopt | |
535 | |
536 @node rlogin/telnet/tn3270 | |
537 @section rlogin, telnet and tn3270 | |
538 @cindex rlogin | |
539 @cindex telnet | |
540 @cindex tn3270 | |
541 @cindex terminal emulation | |
542 @findex terminal-emulator | |
543 | |
544 These URL schemes from RFC 1738 for logon via a terminal emulator have | |
545 the form | |
546 @example | |
547 telnet://@var{user}:@var{password}@@@var{host}:@var{port} | |
548 @end example | |
549 but the @code{:@var{password}} component is ignored. | |
550 | |
551 To handle rlogin, telnet and tn3270 URLs, a @code{rlogin}, | |
552 @code{telnet} or @code{tn3270} (the program names and arguments are | |
553 hardcoded) session is run in a @code{terminal-emulator} buffer. | |
554 Well-known ports are used if the URL does not specify a port. | |
555 | |
556 @node irc | |
557 @section irc | |
558 @cindex IRC | |
559 @cindex Internet Relay Chat | |
560 @cindex ZEN IRC | |
561 @cindex ERC | |
562 @cindex rcirc | |
563 @c Fixme: reference (was http://www.w3.org/Addressing/draft-mirashi-url-irc-01.txt) | |
564 @dfn{Internet Relay Chat} (IRC) is handled by handing off the @sc{irc} | |
565 session to a function named in @code{url-irc-function}. | |
566 | |
567 @defopt url-irc-function | |
A function to actually open an IRC connection.
This function
must take five arguments: @var{host}, @var{port}, @var{channel},
@var{user} and @var{password}. The @var{channel} argument specifies the
channel to join immediately; it can be @code{nil}. By default this is
@code{url-irc-rcirc}.
574 @end defopt | |
575 @defun url-irc-rcirc host port channel user password | |
576 Processes the arguments and lets @code{rcirc} handle the session. | |
577 @end defun | |
578 @defun url-irc-erc host port channel user password | |
579 Processes the arguments and lets @code{ERC} handle the session. | |
580 @end defun | |
581 @defun url-irc-zenirc host port channel user password | |
582 Processes the arguments and lets @code{zenirc} handle the session. | |
583 @end defun | |
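
To choose a different IRC client, or to supply a handler of your own,
set @code{url-irc-function}. In the sketch below,
@code{my-url-irc-log} is a hypothetical handler that only reports the
request instead of opening a connection.

@example
;; Hand IRC URLs to ERC instead of the default client.
(setq url-irc-function 'url-irc-erc)

;; Alternatively, a custom handler taking the five arguments.
(defun my-url-irc-log (host port channel user _password)
  "Report an IRC request for CHANNEL on HOST without connecting."
  (message "IRC request: %s:%s %s (user %s)"
           host port channel user))

(setq url-irc-function 'my-url-irc-log)
@end example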
584 | |
585 @node data | |
586 @section data | |
587 @cindex data URLs | |
588 | |
589 @example | |
data:@r{[}@var{media-type}@r{]}@r{[};base64@r{]},@var{data}
591 @end example | |
592 | |
593 Data URLs contain MIME data in the URL itself. They are defined in | |
594 RFC 2397. | |
595 | |
596 @var{media-type} is a MIME @samp{Content-Type} string, possibly | |
597 including parameters. It defaults to | |
598 @samp{text/plain;charset=US-ASCII}. The @samp{text/plain} can be | |
599 omitted but the charset parameter supplied. If @samp{;base64} is | |
600 present, the @var{data} are base64-encoded. | |
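
As an illustration of the base64 form, the following sketch builds a
data URL from a short ASCII string using the built-in
@code{base64-encode-string}. The helper name @code{my-data-url} is
hypothetical.

@example
(defun my-data-url (text)
  "Return a base64 data URL holding TEXT as text/plain."
  (concat "data:text/plain;base64,"
          (base64-encode-string text)))

(my-data-url "Hello")
     @result{} "data:text/plain;base64,SGVsbG8="
@end example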
601 | |
602 @node nfs | |
603 @section nfs | |
604 @cindex NFS | |
605 @cindex Network File System | |
606 @cindex automounter | |
607 | |
608 @example | |
609 nfs://@var{user}:@var{password}@@@var{host}:@var{port}/@var{file} | |
610 @end example | |
611 | |
612 The @samp{nfs:} scheme is defined in RFC 2224. It is similar to | |
613 @samp{ftp:} except that it points to a file on a remote host that is | |
614 handled by the automounter on the local host. | |
615 | |
@defvar url-nfs-automounter-directory-spec
A string saying how to invoke the NFS automounter. Certain @samp{%}
sequences are recognized:

@table @samp
@item %h
The hostname of the NFS server;
@item %n
The port number of the NFS server;
@item %u
The username to use to authenticate;
@item %p
The password to use to authenticate;
@item %f
The filename on the remote server;
@item %%
A literal @samp{%}.
@end table

Each can be used any number of times.
@end defvar
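
A value of the following shape maps the host and file parts of an NFS
URL onto a path below an automounter directory. The mount point
@file{/net} is an assumption about the local system and should be
adjusted to match the automounter configuration in use.

@example
;; nfs://server/export/file is looked up as /net/server/export/file.
(setq url-nfs-automounter-directory-spec "file:/net/%h%f")
@end example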
637 | |
638 @node cid | |
639 @section cid | |
640 @cindex Content-ID | |
641 | |
The @samp{cid:} scheme (Content-ID URLs) is defined in RFC 2111.
643 | |
644 @node about | |
645 @section about | |
646 | |
647 @node ldap | |
648 @section ldap | |
649 @cindex LDAP | |
650 @cindex Lightweight Directory Access Protocol | |
651 | |
652 The LDAP scheme is defined in RFC 2255. | |
653 | |
654 @node imap | |
655 @section imap | |
656 @cindex IMAP | |
657 | |
The @samp{imap:} scheme is defined in RFC 2192.
659 | |
660 @node man | |
661 @section man | |
662 @cindex @command{man} | |
663 @cindex Unix man pages | |
664 @findex man | |
665 | |
666 @example | |
667 @samp{man:@var{page-spec}} | |
668 @end example | |
669 | |
670 This is a non-standard scheme. @var{page-spec} is passed directly to | |
671 the Lisp @code{man} function. | |
672 | |
673 @node Defining New URLs | |
674 @chapter Defining New URLs | |
675 | |
676 @menu | |
677 * Naming conventions:: | |
678 * Required functions:: | |
679 * Optional functions:: | |
680 * Asynchronous fetching:: | |
681 * Supporting file-name-handlers:: | |
682 @end menu | |
683 | |
684 @node Naming conventions | |
685 @section Naming conventions | |
686 | |
687 @node Required functions | |
688 @section Required functions | |
689 | |
690 @node Optional functions | |
691 @section Optional functions | |
692 | |
693 @node Asynchronous fetching | |
694 @section Asynchronous fetching | |
695 | |
696 @node Supporting file-name-handlers | |
697 @section Supporting file-name-handlers | |
698 | |
699 @node General Facilities | |
700 @chapter General Facilities | |
701 | |
702 @menu | |
703 * Disk Caching:: | |
704 * Proxies:: | |
705 * Gateways in general:: | |
706 * History:: | |
707 @end menu | |
708 | |
709 @node Disk Caching | |
710 @section Disk Caching | |
711 @cindex Caching | |
712 @cindex Persistent Cache | |
713 @cindex Disk Cache | |
714 | |
715 The disk cache stores retrieved documents locally, whence they can be | |
716 retrieved more quickly. When requesting a URL that is in the cache, | |
717 the library checks to see if the page has changed since it was last | |
718 retrieved from the remote machine. If not, the local copy is used, | |
719 saving the transmission over the network. | |
720 @cindex Cleaning the cache | |
721 @cindex Clearing the cache | |
722 @cindex Cache cleaning | |
723 Currently the cache isn't cleared automatically. | |
724 @c Running the @code{clean-cache} shell script | |
725 @c fist is recommended, to allow for future cleaning of the cache. This | |
726 @c shell script will remove all files that have not been accessed since it | |
727 @c was last run. To keep the cache pared down, it is recommended that this | |
728 @c script be run from @i{at} or @i{cron} (see the manual pages for | |
729 @c crontab(5) or at(1) for more information) | |
730 | |
731 @defopt url-automatic-caching | |
732 Setting this variable non-@code{nil} causes documents to be cached | |
733 automatically. | |
734 @end defopt | |
735 | |
736 @defopt url-cache-directory | |
This variable specifies the
directory in which to store the cache files. It defaults to the
subdirectory @file{cache} of @code{url-configuration-directory}.
740 @end defopt | |
741 | |
742 @c Fixme: function v. option, but neither used. | |
743 @c @findex url-cache-expired | |
744 @c @defopt url-cache-expired | |
745 @c This is a function to decide whether or not a cache entry has expired. | |
746 @c It takes two times as it parameters and returns non-@code{nil} if the | |
747 @c second time is ``too old'' when compared with the first time. | |
748 @c @end defopt | |
749 | |
750 @defopt url-cache-creation-function | |
751 The cache relies on a scheme for mapping URLs to files in the cache. | |
752 This variable names a function which sets the type of cache to use. | |
753 It takes a URL as argument and returns the absolute file name of the | |
754 corresponding cache file. The two supplied possibilities are | |
755 @code{url-cache-create-filename-using-md5} and | |
756 @code{url-cache-create-filename-human-readable}. | |
757 @end defopt | |
758 | |
759 @defun url-cache-create-filename-using-md5 url | |
760 Creates a cache file name from @var{url} using MD5 hashing. | |
This creates entries with very few cache collisions and is fast.
762 @cindex MD5 | |
763 @smallexample | |
764 (url-cache-create-filename-using-md5 "http://www.example.com/foo/bar") | |
765 @result{} "/home/fx/.url/cache/fx/http/com/example/www/b8a35774ad20db71c7c3409a5410e74f" | |
766 @end smallexample | |
767 @end defun | |
768 | |
769 @defun url-cache-create-filename-human-readable url | |
770 Creates a cache file name from @var{url} more obviously connected to | |
771 @var{url} than for @code{url-cache-create-filename-using-md5}, but | |
772 more likely to conflict with other files. | |
773 @smallexample | |
774 (url-cache-create-filename-human-readable "http://www.example.com/foo/bar") | |
775 @result{} "/home/fx/.url/cache/fx/http/com/example/www/foo/bar" | |
776 @end smallexample | |
777 @end defun | |
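
Caching might be switched on, with readable cache file names, as in
this sketch. Both variables are the options described in this
section; the particular choice of naming function is only an example.

@example
;; Cache retrieved documents automatically.
(setq url-automatic-caching t)

;; Name cache files after the URL rather than by an MD5 hash.
(setq url-cache-creation-function
      'url-cache-create-filename-human-readable)
@end example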
778 | |
779 @c Fixme: never actually used currently? | |
780 @c @defopt url-standalone-mode | |
781 @c @cindex Relying on cache | |
782 @c @cindex Cache only mode | |
783 @c @cindex Standalone mode | |
784 @c If this variable is non-@code{nil}, the library relies solely on the | |
785 @c cache for fetching documents and avoids checking if they have changed | |
786 @c on remote servers. | |
787 @c @end defopt | |
788 | |
789 @c With a large cache of documents on the local disk, it can be very handy | |
790 @c when traveling, or any other time the network connection is not active | |
791 @c (a laptop with a dial-on-demand PPP connection, etc). Emacs/W3 can rely | |
792 @c solely on its cache, and avoid checking to see if the page has changed | |
793 @c on the remote server. In the case of a dial-on-demand PPP connection, | |
794 @c this will keep the phone line free as long as possible, only bringing up | |
795 @c the PPP connection when asking for a page that is not located in the | |
796 @c cache. This is very useful for demonstrations as well. | |
797 | |
798 @node Proxies | |
799 @section Proxies and Gatewaying | |
800 | |
801 @c fixme: check/document url-ns stuff | |
802 @cindex proxy servers | |
803 @cindex proxies | |
804 @cindex environment variables | |
805 @vindex HTTP_PROXY | |
806 Proxy servers are commonly used to provide gateways through firewalls | |
807 or as caches serving some more-or-less local network. Each protocol | |
(HTTP, FTP, etc.)@: can have a different gateway server. By convention,
proxying is configured in the same way across different programs, through
810 environment variables of the form @code{@var{protocol}_proxy}, where | |
811 @var{protocol} is one of the supported network protocols (@code{http}, | |
812 @code{ftp} etc.). The library recognizes such variables in either | |
813 upper or lower case. Their values are of one of the forms: | |
814 @itemize @bullet | |
815 @item @code{@var{host}:@var{port}} | |
816 @item A full URL; | |
817 @item Simply a host name. | |
818 @end itemize | |

@vindex NO_PROXY
The @code{NO_PROXY} environment variable specifies URLs that should be
excluded from proxying (on servers that should be contacted directly).
This should be a comma-separated list of hostnames, domain names, or a
mixture of both. Asterisks can be used as wildcards, but other
clients may not support that. Domain names may be indicated by a
leading dot. For example:
@example
NO_PROXY="*.aventail.com,home.com,.seanet.com"
@end example
@noindent says to contact all machines in the @samp{aventail.com} and
@samp{seanet.com} domains directly, as well as the machine named
@samp{home.com}. If @code{NO_PROXY} isn't defined, @code{no_PROXY}
and @code{no_proxy} are also tried, in that order.

Proxies may also be specified directly in Lisp.

@defopt url-proxy-services
This variable is an alist of URL schemes and proxy servers that
gateway them. The items are of the form @w{@code{(@var{scheme}
. @var{host}:@var{portnumber})}}, which says that URLs with the given
@var{scheme} are gatewayed through @var{portnumber} on the specified
@var{host}. An exception is the pseudo scheme @code{"no_proxy"},
which is paired with a regexp matching host names not to be proxied.
This variable is initialized from the environment as above.

@example
(setq url-proxy-services
      '(("http"     . "proxy.aventail.com:80")
        ("no_proxy" . "^.*\\(aventail\\|seanet\\)\\.com")))
@end example
@end defopt
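
Because @code{url-proxy-services} is an ordinary dynamic variable, it
can presumably also be bound around a single retrieval instead of
being set globally. The following is a minimal sketch under that
assumption; the proxy host @samp{proxy.example.com} and its port are
placeholders rather than values taken from this manual.

@example
(require 'url)
;; Route one HTTP request through a hypothetical proxy, leaving the
;; global value of `url-proxy-services' untouched.
(let ((url-proxy-services '(("http" . "proxy.example.com:8080"))))
  (url-retrieve-synchronously "http://www.example.com/"))
@end example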

@node Gateways in general
@section Gateways in General
@cindex gateways
@cindex firewalls

The library provides a general gateway layer through which all
networking passes. It can both control access to the network and
provide access through gateways in firewalls. This may make direct
connections in some cases and pass through some sort of gateway in
others.@footnote{Proxies (which only operate over HTTP) are
implemented using this.} The library's basic function responsible for
making connections is @code{url-open-stream}.

@defun url-open-stream name buffer host service
@cindex opening a stream
@cindex stream, opening
Open a stream to @var{host}, possibly via a gateway. The other
arguments are as for @code{open-network-stream}. This will not make a
connection if @code{url-gateway-unplugged} is non-@code{nil}.
@end defun
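
As a rough sketch of a call, a stream to a placeholder host might be
opened as follows. The process name, the buffer name and the host are
arbitrary choices made for illustration, and the feature name
@code{url-gw} is assumed to be the one that provides this function.

@example
(require 'url-gw)
;; Open a connection to port 80 of a placeholder host, going through
;; whatever gateway method is currently configured.
(url-open-stream "example" (get-buffer-create " *example*")
                 "www.example.com" 80)
@end example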

@defvar url-gateway-local-host-regexp
This is a regular expression that matches local hosts that do not
require the use of a gateway. If @code{nil}, all connections are made
through the gateway.
@end defvar

@defvar url-gateway-method
This variable controls which gateway method is used. It may be useful
to bind it temporarily in some applications. It has values taken from
a list of symbols. Possible values are:

@table @code
@item telnet
@cindex @command{telnet}
Use this method if you must first telnet and log into a gateway host,
and then run telnet from that host to connect to outside machines.

@item rlogin
@cindex @command{rlogin}
This method is identical to @code{telnet}, but uses @command{rlogin}
to log into the remote machine without having to send the username and
password over the wire every time.

@item socks
@cindex @sc{socks}
Use if the firewall has a @sc{socks} gateway running on it. The
@sc{socks} v5 protocol is defined in RFC 1928.

@c @item ssl
@c This probably shouldn't be documented
@c Fixme: why not? -- fx

@item native
This method uses Emacs's builtin networking directly. This is the
default. It can be used only if there is no firewall blocking access.
@end table
@end defvar

The following variables control the gateway methods.

@defopt url-gateway-telnet-host
The gateway host to telnet to. Once logged in there, you then telnet
out to the hosts you want to connect to.
@end defopt
@defopt url-gateway-telnet-parameters
This should be a list of parameters to pass to the @command{telnet} program.
@end defopt
@defopt url-gateway-telnet-password-prompt
This is a regular expression that matches the password prompt when
logging in.
@end defopt
@defopt url-gateway-telnet-login-prompt
This is a regular expression that matches the username prompt when
logging in.
@end defopt
@defopt url-gateway-telnet-user-name
The username to log in with.
@end defopt
@defopt url-gateway-telnet-password
The password to send when logging in.
@end defopt
@defopt url-gateway-prompt-pattern
This is a regular expression that matches the shell prompt.
@end defopt

@defopt url-gateway-rlogin-host
Host to @samp{rlogin} to before telnetting out.
@end defopt
@defopt url-gateway-rlogin-parameters
Parameters to pass to @samp{rsh}.
@end defopt
@defopt url-gateway-rlogin-user-name
User name to use when logging in to the gateway.
@end defopt
@defopt url-gateway-prompt-pattern
This is a regular expression that matches the shell prompt.
@end defopt

@defopt socks-server
This specifies the default server. It takes the form
@w{@code{("Default server" @var{server} @var{port} @var{version})}}
where @var{version} can be either 4 or 5.
@end defopt
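
For illustration, a configuration that sends connections through a
@sc{socks} gateway might look like the sketch below. The host name
@samp{socks.example.com} and port 1080 are placeholders, not defaults
documented here.

@example
;; Hypothetical SOCKS v5 server; adjust host, port and version.
(setq socks-server '("Default server" "socks.example.com" 1080 5))
;; Select the socks gateway method described above.
(setq url-gateway-method 'socks)
@end example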
@defvar socks-password
If this is @code{nil} then you will be asked for the password,
otherwise it will be used as the password for authenticating you to
the @sc{socks} server.
@end defvar
@defvar socks-username
This is the username to use when authenticating yourself to the
@sc{socks} server. By default this is your login name.
@end defvar
@defvar socks-timeout
This controls how long, in seconds, to wait for responses from the
@sc{socks} server; it is 5 by default.
@end defvar
@c fixme: these have been effectively commented-out in the code
@c @defopt socks-server-aliases
@c This is a list of server aliases. It is a list of aliases of the form
@c @var{(alias hostname port version)}.
@c @end defopt
@c @defopt socks-network-aliases
@c This is a list of network aliases. Each entry in the list takes the form
@c @var{(alias (network))} where @var{alias} is a string that names the
@c @var{network}. The networks can contain a pair (not a dotted pair) of
@c @sc{ip} addresses which specify a range of @sc{ip} addresses, an @sc{ip}
@c address and a netmask, a domain name or a unique hostname or @sc{ip}
@c address.
@c @end defopt
@c @defopt socks-redirection-rules
@c This is a list of redirection rules. Each rule takes the form
@c @var{(Destination network Connection type)} where @var{Destination
@c network} is a network alias from @code{socks-network-aliases} and
@c @var{Connection type} can be @code{nil} in which case a direct
@c connection is used, or it can be an alias from
@c @code{socks-server-aliases} in which case that server is used as a
@c proxy.
@c @end defopt
@defopt socks-nslookup-program
@cindex @command{nslookup}
This is the @samp{nslookup} program. It is @code{"nslookup"} by default.
@end defopt

@menu
* Suppressing network connections::
@end menu
@c * Broken hostname resolution::

@node Suppressing network connections
@subsection Suppressing Network Connections

@cindex network connections, suppressing
@cindex suppressing network connections
@cindex bugs, HTML
@cindex HTML `bugs'
In some circumstances it is desirable to suppress making network
connections. A typical case is when rendering HTML in a mail user
agent, when external URLs should not be activated, particularly to
avoid `bugs' which `call home' by fetching single-pixel images and the
like. To arrange this, bind the following variable for the duration
of such processing.

@defvar url-gateway-unplugged
If this variable is non-@code{nil} new network connections are never
opened by the URL library.
@end defvar
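
A binding of this kind might be sketched as follows. The function
@code{my-render-html} is hypothetical and stands for whatever code
renders the message; the @code{require} is there on the assumption
that @code{url-gw} defines the variable, so that the @code{let} binds
it dynamically.

@example
(require 'url-gw)
;; `my-render-html' is a hypothetical rendering function.  While it
;; runs, the URL library will not open new network connections.
(let ((url-gateway-unplugged t))
  (my-render-html (current-buffer)))
@end example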

@c @node Broken hostname resolution
@c @subsection Broken Hostname Resolution

@c @cindex hostname resolver
@c @cindex resolver, hostname
@c Some C libraries do not include the hostname resolver routines in
@c their static libraries. If Emacs was linked statically, and was not
@c linked with the resolver libraries, it will not be able to get to any
@c machines off the local network. This is characterized by being able
@c to reach someplace with a raw ip number, but not its hostname
@c (@url{http://129.79.254.191/} works, but
@c @url{http://www.cs.indiana.edu/} doesn't). This used to happen on
@c SunOS4 and Ultrix, but is probably now rare. If Emacs can't be
@c rebuilt linked against the resolver library, it can use the external
@c @command{nslookup} program instead.

@c @defopt url-gateway-broken-resolution
@c @cindex @code{nslookup} program
@c @cindex program, @code{nslookup}
@c If non-@code{nil}, this variable says to use the program specified by
@c @code{url-gateway-nslookup-program} to do hostname resolution.
@c @end defopt

@c @defopt url-gateway-nslookup-program
@c The name of the program to do hostname lookup if Emacs can't do it
@c directly. This program should expect a single argument on the command
@c line---the hostname to resolve---and should produce output similar to
@c the standard Unix @command{nslookup} program:
@c @example
@c Name: www.cs.indiana.edu
@c Address: 129.79.254.191
@c @end example
@c @end defopt

@node History
@section History

@findex url-do-setup
The library can maintain a global history list tracking URLs accessed.
URL completion can be done from it. The history mechanism is set up
automatically via @code{url-do-setup} when it is configured to be on.
Note that the size of the history list is currently not limited.

@vindex url-history-hash-table
The history `list' is actually a hash table,
@code{url-history-hash-table}. It contains access times keyed by URL
strings. The times are in the format returned by @code{current-time}.

@defun url-history-update-url url time
This function updates the history table with an entry for @var{url}
accessed at the given @var{time}.
@end defun
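
As an informal illustration, an entry can be recorded and then read
back straight from the hash table. The URL is a placeholder, and the
feature name @code{url-history} is an assumption about where these
definitions live.

@example
(require 'url-history)
;; Record an access to a placeholder URL at the current time.
(url-history-update-url "http://www.example.com/" (current-time))
;; Look up the recorded access time for that URL.
(gethash "http://www.example.com/" url-history-hash-table)
@end example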

@defopt url-history-track
If non-@code{nil}, the library will keep track of all the URLs
accessed. If it is @code{t}, the list is saved to disk at the end of
each Emacs session. The default is @code{nil}.
@end defopt

@defopt url-history-file
The file storing the history list between sessions. It defaults to
@file{history} in @code{url-configuration-directory}.
@end defopt

@defopt url-history-save-interval
@findex url-history-setup-save-timer
The number of seconds between automatic saves of the history list.
Default is one hour. Note that if you change this variable directly,
rather than using Custom, after @code{url-do-setup} has been run, you
need to run the function @code{url-history-setup-save-timer}.
@end defopt
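
Taken together, the history options above might be set from Lisp along
these lines. This is only a sketch: the 30-minute interval is an
arbitrary example value, and the feature name @code{url-history} is
assumed.

@example
(require 'url-history)
;; Track URLs and save the list to disk when the session ends.
(setq url-history-track t)
;; Save every 30 minutes instead of the default hour.
(setq url-history-save-interval 1800)
;; Only needed if `url-do-setup' has already been run.
(url-history-setup-save-timer)
@end example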

@defun url-history-parse-history &optional fname
Parses the history file @var{fname} (default @code{url-history-file})
and sets up the history list.
@end defun

@defun url-history-save-history &optional fname
Saves the current history to file @var{fname} (default
@code{url-history-file}).
@end defun

@defun url-completion-function string predicate function
You can use this function to do completion of URLs from the history.
@end defun
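
The three arguments follow the usual calling convention for a
programmed completion function, so one plausible use is to pass the
function as the collection argument of @code{completing-read}. The
prompt string below is arbitrary, and this usage is a sketch rather
than something this manual prescribes.

@example
(require 'url-history)
;; Prompt for a URL, completing against the history table.
(completing-read "URL: " 'url-completion-function)
@end example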

@node Customization
@chapter Customization

@section Environment Variables

@cindex environment variables
The following environment variables affect the library's operation at
startup.

@table @code
@item TMPDIR
@vindex TMPDIR
@vindex url-temporary-directory
If this is defined, @code{url-temporary-directory} is initialized from
it.
@end table

@section General User Options

The following user options, settable with Customize, affect the
general operation of the package.

@defopt url-debug
@cindex debugging
Specifies the types of debug messages which are logged to
the @code{*URL-DEBUG*} buffer.
@code{t} means log all messages.
A number means log all messages and show them with @code{message}.
It may also be a list of the types of messages to be logged.
@end defopt
@defopt url-personal-mail-address
@end defopt
@defopt url-privacy-level
@end defopt
@defopt url-uncompressor-alist
@end defopt
@defopt url-passwd-entry-func
@end defopt
@defopt url-standalone-mode
@end defopt
@defopt url-bad-port-list
@end defopt
@defopt url-max-password-attempts
@end defopt
@defopt url-temporary-directory
@end defopt
@defopt url-show-status
@end defopt
@defopt url-confirmation-func
The function to use for asking yes-or-no questions. This is normally
either @code{y-or-n-p} or @code{yes-or-no-p}, but could be another
function taking a single argument (the prompt) and returning @code{t}
only if an affirmative answer is given.
@end defopt
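
For example, to have confirmations answered with a single keystroke,
the option could be set as in this small sketch; choosing
@code{y-or-n-p} here is just one of the two standard possibilities
named above.

@example
;; Accept a single `y' or `n' keystroke instead of a full word.
(setq url-confirmation-func 'y-or-n-p)
@end example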
@defopt url-gateway-method
@c fixme: describe gatewaying
A symbol specifying the type of gateway support to use for connections
from the local machine. The supported methods are:

@table @code
@item telnet
Run telnet in a subprocess to connect;
@item rlogin
Rlogin to another machine to connect;
@item socks
Connect through a socks server;
@item ssl
Connect with SSL;
@item native
Connect directly.
@end table
@end defopt

@node GNU Free Documentation License
@appendix GNU Free Documentation License
@include doclicense.texi

@node Function Index
@unnumbered Command and Function Index
@printindex fn

@node Variable Index
@unnumbered Variable Index
@printindex vr

@node Concept Index
@unnumbered Concept Index
@printindex cp

@setchapternewpage odd
@contents
@bye

@ignore
arch-tag: c96be356-7e2d-4196-bcda-b13246c5c3f0
@end ignore