comparison man/url.texi @ 58830:27baac8434ba

url.texi: New file.
author Stefan Monnier <monnier@iro.umontreal.ca>
date Tue, 07 Dec 2004 16:55:48 +0000
parents
children d97ebd9e30f6
58829:bf43c774d02c 58830:27baac8434ba
1 \input texinfo
2 @setfilename url.info
3 @settitle URL Programmer's Manual
4
5 @iftex
6 @c @finalout
7 @end iftex
8 @c @setchapternewpage odd
9 @c @smallbook
10
11 @tex
12 \overfullrule=0pt
13 %\global\baselineskip 30pt % for printing in double space
14 @end tex
15 @dircategory World Wide Web
16 @dircategory GNU Emacs Lisp
17 @direntry
18 * URL: (url). URL loading package.
19 @end direntry
20
21 @ifnottex
22 This file documents the URL loading package.
23
24 Copyright (C) 1996, 1997, 1998, 1999, 2002, 2004 Free Software Foundation
25 Copyright (C) 1993, 1994, 1995, 1996 William M. Perry
26
27 Permission is granted to copy, distribute and/or modify this document
28 under the terms of the GNU Free Documentation License, Version 1.1 or
29 any later version published by the Free Software Foundation; with the
30 Invariant Sections being
31 ``GNU GENERAL PUBLIC LICENSE''. A copy of the
32 license is included in the section entitled ``GNU Free Documentation
33 License.''
34 @end ifnottex
35
36 @c
37 @titlepage
38 @sp 6
39 @center @titlefont{URL}
40 @center @titlefont{Programmer's Manual}
41 @sp 4
42 @center First Edition, URL Version 2.0
43 @sp 1
44 @c @center December 1999
45 @sp 5
46 @center William M. Perry
47 @center @email{wmperry@@gnu.org}
48 @center David Love
49 @center @email{fx@@gnu.org}
50 @page
51 @vskip 0pt plus 1filll
52 Copyright @copyright{} 1993, 1994, 1995, 1996 William M. Perry@*
53 Copyright @copyright{} 1996, 1997, 1998, 1999, 2002 Free Software Foundation
54
55 Permission is granted to copy, distribute and/or modify this document
56 under the terms of the GNU Free Documentation License, Version 1.1 or
57 any later version published by the Free Software Foundation; with the
58 Invariant Sections being
59 ``GNU GENERAL PUBLIC LICENSE''. A copy of the
60 license is included in the section entitled ``GNU Free Documentation
61 License.''
62 @end titlepage
63 @page
64 @node Top
65 @top URL
66
67
68
69 @menu
70 * Getting Started:: Preparing your program to use URLs.
71 * Retrieving URLs:: How to use this package to retrieve a URL.
72 * Supported URL Types:: Descriptions of URL types currently supported.
73 * Defining New URLs:: How to define a URL loader for a new protocol.
74 * General Facilities:: URLs can be cached, accessed via a gateway
75 and tracked in a history list.
76 * Customization:: Variables you can alter.
77 * Function Index::
78 * Variable Index::
79 * Concept Index::
80 @end menu
81
82 @node Getting Started
83 @chapter Getting Started
84 @cindex URLs, definition
85 @cindex URIs
86
87 @dfn{Uniform Resource Locators} (URLs) are a specific form of
88 @dfn{Uniform Resource Identifiers} (URIs) described in RFC 2396, which
89 updates RFC 1738 and RFC 1808. RFC 2016 defines uniform resource
90 agents.
91
92 URIs have the form @var{scheme}:@var{scheme-specific-part}, where the
93 @var{scheme}s supported by this library are described below.
94 @xref{Supported URL Types}.
95
96 FTP, NFS, HTTP, HTTPS, @code{rlogin}, @code{telnet}, tn3270,
97 IRC and gopher URLs all have the form
98
99 @example
100 @var{scheme}://@r{[}@var{userinfo}@@@r{]}@var{hostname}@r{[}:@var{port}@r{]}@r{[}/@var{path}@r{]}
101 @end example
102 @noindent
103 where @samp{@r{[}} and @samp{@r{]}} delimit optional parts.
104 @var{userinfo} sometimes takes the form @var{username}:@var{password}
105 but you should beware of the security risks of sending cleartext
106 passwords. @var{hostname} may be a domain name or a dotted decimal
107 address. If the @samp{:@var{port}} is omitted then the library will
108 use the `well known' port for that service when accessing URLs. With
109 the possible exception of @code{telnet}, it is rare for ports to be
110 specified, and using a non-standard port may have
111 undesired consequences if a different service is listening on that
112 port (e.g.@: an HTTP URL specifying the SMTP port can cause mail to be
113 sent).@c , but @xref{Other Variables, url-bad-port-list}.
114 The meaning of
115 the @var{path} component depends on the service.
116
117 The library depends on MIME support provided by the @samp{mm-}
118 packages from Gnus 5.8 or later. @xref{(emacs-mime)Top, The MIME
119 library}.
120
121 @menu
122 * Configuration::
123 * Parsed URLs:: URLs are parsed into vector structures.
124 @end menu
125
126 @node Configuration
127 @section Configuration
128
129 @defvar url-configuration-directory
130 @cindex @file{~/.url}
131 @cindex configuration files
132 The directory in which URL configuration files, the cache, etc.@:
133 reside. The default is @file{~/.url}.
134 @end defvar
135
136 @node Parsed URLs
137 @section Parsed URLs
138 @cindex parsed URLs
139 The library functions typically operate on @dfn{parsed} versions of
140 URLs. These are actually vectors of the form:
141
142 @example
143 [@var{type} @var{user} @var{password} @var{host} @var{port} @var{file} @var{target} @var{attributes} @var{full}]
144 @end example
145
146 @noindent where
147 @table @var
148 @item type
149 is the type of the URL scheme, e.g.@: @code{http};
150 @item user
151 is the username associated with it, or @code{nil};
152 @item password
153 is the user password associated with it, or @code{nil};
154 @item host
155 is the host name associated with it, or @code{nil};
156 @item port
157 is the port number associated with it, or @code{nil};
158 @item file
159 is the `file' part of it, or @code{nil}. This doesn't necessarily
160 actually refer to a file;
161 @item target
162 is the target part, or @code{nil};
163 @item attributes
164 is the attributes associated with it, or @code{nil};
165 @item full
166 is @code{t} for a fully-specified URL, with a host part indicated by
167 @samp{//} after the scheme part.
168 @end table
169
170 @findex url-type
171 @findex url-user
172 @findex url-password
173 @findex url-host
174 @findex url-port
175 @findex url-file
176 @findex url-target
177 @findex url-attributes
178 @findex url-full
179 @findex url-set-type
180 @findex url-set-user
181 @findex url-set-password
182 @findex url-set-host
183 @findex url-set-port
184 @findex url-set-file
185 @findex url-set-target
186 @findex url-set-attributes
187 @findex url-set-full
188 These attributes have accessors named @code{url-@var{part}}, where
189 @var{part} is the name of one of the elements above, e.g.@:
190 @code{url-host}. Similarly, there are setters of the form
191 @code{url-set-@var{part}}.
192
193 There are functions for parsing and unparsing between the string and
194 vector forms.
195
196 @defun url-generic-parse-url url
197 Return a parsed version of the string @var{url}.
198 @end defun
199
200 @defun url-recreate-url url
201 @cindex unparsing URLs
202 Recreates a URL string from the parsed @var{url}.
203 @end defun
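
As a minimal sketch of the round trip described above, the following
parses a URL string, reads two components through their accessors, and
turns the parsed form back into a string. The URL is only an
illustration, and the @code{require} assumes that the parsing functions
are provided by a feature named @code{url-parse}.

@example
(require 'url-parse)

(let ((parsed (url-generic-parse-url "http://www.example.com/foo/bar")))
  (list (url-type parsed)            ; scheme of the URL
        (url-host parsed)            ; host name
        (url-recreate-url parsed)))  ; back to a string
@end example

@noindent
The first two elements of the result are @code{"http"} and
@code{"www.example.com"}.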
204
205 @node Retrieving URLs
206 @chapter Retrieving URLs
207
208 @defun url-retrieve-synchronously url
209 Retrieve @var{url} synchronously and return a buffer containing the
210 data. @var{url} is either a string or a parsed URL structure. Return
211 @code{nil} if there are no data associated with it (the case for dired,
212 info, or mailto URLs that need no further processing).
213 @end defun
214
215 @defun url-retrieve url callback &optional cbargs
216 Retrieve @var{url} asynchronously and call @var{callback} with args
217 @var{cbargs} when finished. The callback is called when the object
218 has been completely retrieved, with the current buffer containing the
219 object and any MIME headers associated with it. @var{url} is either a
220 string or a parsed URL structure. Returns the buffer @var{url} will
221 load into, or @code{nil} if the process has already completed.
222 @end defun
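
The two retrieval styles might be used as sketched below. All three
@code{my-url-} functions are hypothetical helpers written for this
illustration and are not part of the library. The callback accepts and
ignores any arguments, so it does not depend on exactly what is passed
to it.

@example
(require 'url)

(defun my-url-fetch-string (url)
  "Hypothetical helper: return the data retrieved for URL as a string.
Return nil when the retrieval yields no buffer."
  (let ((buffer (url-retrieve-synchronously url)))
    (when buffer
      (prog1 (with-current-buffer buffer (buffer-string))
        (kill-buffer buffer)))))

(defun my-url-report (&rest _ignored)
  "Hypothetical callback: report the size of the retrieved object.
It runs with the retrieval buffer current."
  (message "Retrieved %d characters" (buffer-size)))

(defun my-url-fetch-async (url)
  "Hypothetical helper: start retrieving URL in the background."
  (url-retrieve url 'my-url-report))
@end example

@noindent
Nothing is fetched until one of the helpers is called, e.g.@: with
@code{(my-url-fetch-async "http://www.example.com/")}. For HTTP URLs
the string returned by the first helper includes any headers stored in
the buffer as well as the body.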
223
224 @node Supported URL Types
225 @chapter Supported URL Types
226
227 @menu
228 * http/https:: Hypertext Transfer Protocol.
229 * file/ftp:: Local files and FTP archives.
230 * info:: Emacs `Info' pages.
231 * mailto:: Sending email.
232 * news/nntp/snews:: Usenet news.
233 * rlogin/telnet/tn3270:: Remote host connectivity.
234 * irc:: Internet Relay Chat.
235 * data:: Embedded data URLs.
236 * nfs:: Network File System.
237 @c * finger::
238 @c * gopher::
239 @c * netrek::
240 @c * prospero::
241 * cid:: Content-ID.
242 * about::
243 * ldap:: Lightweight Directory Access Protocol.
244 * imap:: IMAP mailboxes.
245 * man:: Unix man pages.
246 @end menu
247
248 @node http/https
249 @section @code{http} and @code{https}
250
251 The scheme @code{http} is Hypertext Transfer Protocol. The library
252 supports version 1.1, specified in RFC 2616. (This supersedes 1.0,
253 defined in RFC 1945.) HTTP URLs have the following form, where most of
254 the parts are optional:
255 @example
256 http://@var{user}:@var{password}@@@var{host}:@var{port}/@var{path}?@var{searchpart}#@var{fragment}
257 @end example
258 @c The @code{:@var{port}} part is optional, and @var{port} defaults to
259 @c 80. The @code{/@var{path}} part, if present, is a slash-separated
260 @c series elements. The @code{?@var{searchpart}}, if present, is the
261 @c query for a search or the content of a form submission. The
262 @c @code{#fragment} part, if present, is a location in the document.
263
264 The scheme @code{https} is a secure version of @code{http}, with
265 transmission via SSL. It is defined in RFC 2818. Its default port is
266 443. This scheme depends on SSL support in Emacs via the
267 @file{ssl.el} library and is actually implemented by forcing the
268 @code{ssl} gateway method to be used. @xref{Gateways in general}.
269
270 @defopt url-honor-refresh-requests
271 This controls honouring of HTTP @samp{Refresh} headers by which
272 servers can direct clients to reload documents from the same URL or a
273 different one. @code{nil} means they will not be honoured,
274 @code{t} (the default) means they will always be honoured, and
275 otherwise the user will be asked on each request.
276 @end defopt
277
278
279 @menu
280 * Cookies::
281 * HTTP language/coding::
282 * HTTP URL Options::
283 * Dealing with HTTP documents::
284 @end menu
285
286 @node Cookies
287 @subsection Cookies
288
289 @defopt url-cookie-file
290 The file in which cookies are stored, defaulting to @file{cookies} in
291 the directory specified by @code{url-configuration-directory}.
292 @end defopt
293
294 @defopt url-cookie-confirmation
295 Specifies whether confirmation is required to accept cookies.
296 @end defopt
297
298 @defopt url-cookie-multiple-line
299 Specifies whether to put all cookies for the server on one line in the
300 HTTP request to satisfy broken servers like
301 @url{http://www.hotmail.com}.
302 @end defopt
303
304 @defopt url-cookie-trusted-urls
305 A list of regular expressions matching URLs from which to accept
306 cookies always.
307 @end defopt
308
309 @defopt url-cookie-untrusted-urls
310 A list of regular expressions matching URLs from which to reject
311 cookies always.
312 @end defopt
313
314 @defopt url-cookie-save-interval
315 The number of seconds between automatic saves of cookies to disk.
316 Default is one hour.
317 @end defopt
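
A cookie policy built from the options above might look like the
following sketch. The host names, the regular expressions and the
30-minute save interval are illustrative assumptions, not recommended
values.

@example
(setq url-cookie-confirmation t          ; ask before accepting
      ;; Always accept cookies from this (example) site ...
      url-cookie-trusted-urls '("^https?://www\\.example\\.com/")
      ;; ... and always reject them from this one.
      url-cookie-untrusted-urls '("^https?://ads\\.example\\.net/")
      url-cookie-save-interval 1800)     ; seconds between saves
@end example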
318
319
320 @node HTTP language/coding
321 @subsection Language and Encoding Preferences
322
323 HTTP allows clients to express preferences for the language and
324 encoding of documents which servers may honour.
325
326 @defopt url-mime-charset-string
327 @cindex character sets
328 @cindex coding systems
329 This variable specifies a preference for character sets when documents
330 can be served in more than one encoding.
331
332 HTTP allows specifying a list of MIME charsets which indicate your
333 preferred character set encodings, e.g.@: Latin-9 or Big5, and these
334 can be weighted. In Emacs 21 this list is generated automatically
335 from the list of defined coding systems which have associated MIME
336 types. These are sorted by coding priority. @xref{Recognize Coding,
337 , Recognizing Coding Systems, emacs, GNU Emacs Manual}.
338 @end defopt
339
340 @defopt url-mime-language-string
341 @cindex language preferences
342 A string specifying the preferred language when servers can serve
343 files in several languages. Use RFC 1766 abbreviations, e.g.@:
344 @samp{en} for English, @samp{de} for German. It can be a
345 comma-separated list in descending order of preference. The ordering
346 can be made explicit using `q' factors defined by HTTP, e.g.@:
347 @w{@samp{de, en-gb;q=0.8, en;q=0.7}}. It can be @samp{*} to get the
348 first available language (as opposed to the default).
349 @end defopt
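
For instance, a preference for German, falling back to British and then
to any other English, could be expressed with the string shown in the
description above. This is only a sketch of one possible setting:

@example
(setq url-mime-language-string "de, en-gb;q=0.8, en;q=0.7")
@end example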
350
351 @node HTTP URL Options
352 @subsection HTTP URL Options
353
354 HTTP supports an @samp{OPTIONS} method describing things supported by
355 the URL@.
356
357 @defun url-http-options url
358 Returns a property list describing options available for @var{url}. The
359 property list members are:
360
361 @table @code
362 @item methods
363 A list of symbols specifying what HTTP methods the resource
364 supports.
365
366 @item dav
367 @cindex DAV
368 A list of numbers specifying what DAV protocol/schema versions are
369 supported.
370
371 @item dasl
372 @cindex DASL
373 A list of DASL search types supported (string form).
374
375 @item ranges
376 A list of the units available for use in partial document fetches.
377
378 @item p3p
379 @cindex P3P
380 The @dfn{Platform for Privacy Preferences} description for the resource.
381 Currently this is just the raw header contents.
382 @end table
383
384 @end defun
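
Since the result is a property list, individual members can be read
with @code{plist-get}. The helper below is a hypothetical illustration,
not part of the library, and the @code{require} assumes that
@code{url-http-options} is provided by a feature named @code{url-http}.

@example
(require 'url-http)

(defun my-url-http-methods (url)
  "Hypothetical helper: return the HTTP methods that URL supports."
  (plist-get (url-http-options url) 'methods))
@end example

@noindent
Calling the helper contacts the server. The exact spelling of the
method symbols is not specified here, so inspect the returned list
before testing it for a particular method.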
385
386 @node Dealing with HTTP documents
387 @subsection Dealing with HTTP documents
388
389 HTTP URLs are retrieved into a buffer containing the HTTP headers
390 followed by the body. Since the headers are quasi-MIME, they may be
391 processed using the MIME library. @xref{(emacs-mime)Top, The MIME
392 library}. The MIME library doesn't provide a clean function to do
393 that, so the URL library does.
394
395 @defun url-decode-text-part handle &optional coding
396 This function decodes charset-encoded text in the current buffer. In
397 Emacs, the buffer is expected to be unibyte initially and is set to
398 multibyte after decoding.
399 @var{handle} is the MIME handle of the original part. @var{coding} is an explicit
400 coding to use, overriding what the MIME headers specify.
401 The coding system used for the decoding is returned.
402
403 Note that this function doesn't deal with @samp{http-equiv} charset
404 specifications in HTML @samp{<meta>} elements.
405 @end defun
406
407 @node file/ftp
408 @section file and ftp
409 @cindex files
410 @cindex FTP
411 @cindex File Transfer Protocol
412 @cindex compressed files
413 @findex dired
414
415 @example
416 ftp://@var{user}:@var{password}@@@var{host}:@var{port}/@var{file}
417 file://@var{user}:@var{password}@@@var{host}:@var{port}/@var{file}
418 @end example
419
420 These schemes are defined in RFC 1738.
421 @samp{ftp:} and @samp{file:} are synonymous in this library. They
422 allow reading arbitrary files from hosts. Either @samp{ange-ftp}
423 (Emacs) or @samp{efs} (XEmacs) is used to retrieve them from remote
424 hosts. Local files are accessed directly.
425
426 Compressed files are handled, but support is hard-coded so that
427 @code{jka-compr-compression-info-list} and so on have no effect.
428 Suffixes recognized are @samp{.z}, @samp{.gz}, @samp{.Z} and
429 @samp{.bz2}.
430
431 @defopt url-directory-index-file
432 The filename to look for when indexing a directory, default
433 @samp{"index.html"}. If this file exists, and is readable, then it
434 will be viewed instead of using @code{dired} to view the directory.
435 @end defopt
436
437 @node info
438 @section info
439 @cindex Info
440 @cindex Texinfo
441 @findex Info-goto-node
442
443 @example
444 info:@var{file}#@var{node}
445 @end example
446
447 Info URLs are not officially defined. They invoke
448 @code{Info-goto-node} with argument @samp{(@var{file})@var{node}}.
449 @samp{#@var{node}} is optional, defaulting to @samp{Top}.
450
451 @node mailto
452 @section mailto
453
454 @cindex mailto
455 @cindex email
456 A mailto URL will send an email message to the address in the
457 URL, for example @samp{mailto:foo@@bar.com} would compose a
458 message to @samp{foo@@bar.com}.
459
460 @defopt url-mail-command
461 @vindex mail-user-agent
462 The function called whenever url needs to send mail. This should
463 normally be left to default from @code{mail-user-agent}. @xref{Mail
464 Methods, , Mail-Composition Methods, emacs, GNU Emacs Manual}.
465 @end defopt
466
467 An @samp{X-Url-From} header field containing the URL of the document
468 that contained the mailto URL is added if that URL is known.
469
470 RFC 2368 extends the definition of mailto URLs in RFC 1738.
471 The form of a mailto URL is
472 @example
473 @samp{mailto:@var{mailbox}[?@var{header}=@var{contents}[&@var{header}=@var{contents}]]}
474 @end example
475 @noindent where an arbitrary number of @var{header}s can be added. If the
476 @var{header} is @samp{body}, then @var{contents} is put in the body;
477 otherwise a @var{header} header field is created with @var{contents}
478 as its contents. Note that the URL library does not consider any
479 headers `dangerous' so you should check them before sending the
480 message.
481
482 @c Fixme: update
483 Email messages are defined in RFC 822.
484
485 @node news/nntp/snews
486 @section @code{news}, @code{nntp} and @code{snews}
487 @cindex news
488 @cindex network news
489 @cindex usenet
490 @cindex NNTP
491 @cindex snews
492
493 @c draft-gilman-news-url-01
494 The network news URL schemes take the following forms, following RFC
495 1738, except that for compatibility with other clients, host and port
496 fields may be included in news URLs though they are properly only
497 allowed for nntp and snews.
498
499 @table @samp
500 @item news:@var{newsgroup}
501 Retrieves a list of messages in @var{newsgroup};
502 @item news:@var{message-id}
503 Retrieves the message with the given @var{message-id};
504 @item news:*
505 Retrieves a list of all available newsgroups;
506 @item nntp://@var{host}:@var{port}/@var{newsgroup}
507 @itemx nntp://@var{host}:@var{port}/@var{message-id}
508 @itemx nntp://@var{host}:@var{port}/*
509 Similar to the @samp{news} versions.
510 @end table
511
512 @samp{:@var{port}} is optional and defaults to :119.
513
514 @samp{snews} is the same as @samp{nntp} except that the default port
515 is :563.
516 @cindex SSL
517 (It is tunnelled through SSL.)
518
519 An @samp{nntp} URL is the same as a news URL, except that the URL may
520 specify an article by its number.
521
522 @defopt url-news-server
523 This variable can be used to override the default news server.
524 Usually this will be set by the Gnus package, which is used to fetch
525 news.
526 @cindex environment variable
527 @vindex NNTPSERVER
528 It may be set from the conventional environment variable
529 @code{NNTPSERVER}.
530 @end defopt
531
532 @node rlogin/telnet/tn3270
533 @section rlogin, telnet and tn3270
534 @cindex rlogin
535 @cindex telnet
536 @cindex tn3270
537 @cindex terminal emulation
538 @findex terminal-emulator
539
540 These URL schemes from RFC 1738 for logon via a terminal emulator have
541 the form
542 @example
543 telnet://@var{user}:@var{password}@@@var{host}:@var{port}
544 @end example
545 but the @code{:@var{password}} component is ignored.
546
547 To handle rlogin, telnet and tn3270 URLs, a @code{rlogin},
548 @code{telnet} or @code{tn3270} (the program names and arguments are
549 hardcoded) session is run in a @code{terminal-emulator} buffer.
550 Well-known ports are used if the URL does not specify a port.
551
552 @node irc
553 @section irc
554 @cindex IRC
555 @cindex Internet Relay Chat
556 @cindex ZEN IRC
557 @c Fixme: reference (was http://www.w3.org/Addressing/draft-mirashi-url-irc-01.txt)
558 @dfn{Internet Relay Chat} (IRC) is handled by handing off the @sc{irc}
559 session to a function named in @code{url-irc-function}.
560
561 @defopt url-irc-function
562 A function to actually open an IRC connection.
563 This function
564 must take five arguments, @var{host}, @var{port}, @var{channel},
565 @var{user} and @var{password}. The @var{channel} argument specifies the
566 channel to join immediately; this can be @code{nil}. By default this is
567 @code{url-irc-zenirc}.
568 @end defopt
569 @defun url-irc-zenirc host port channel user password
570 Processes the arguments and lets @code{zenirc} handle the session.
571 @end defun
572
573 @node data
574 @section data
575 @cindex data URLs
576
577 @example
578 data:@r{[}@var{media-type}@r{]}@r{[};base64@r{]},@var{data}
579 @end example
580
581 Data URLs contain MIME data in the URL itself. They are defined in
582 RFC 2397.
583
584 @var{media-type} is a MIME @samp{Content-Type} string, possibly
585 including parameters. It defaults to
586 @samp{text/plain;charset=US-ASCII}. The @samp{text/plain} can be
587 omitted but the charset parameter supplied. If @samp{;base64} is
588 present, the @var{data} are base64-encoded.
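
As a small worked example, a base64-encoded data URL for the text
@samp{Hello} can be assembled with the standard
@code{base64-encode-string} function. The payload is arbitrary.

@example
(concat "data:text/plain;base64," (base64-encode-string "Hello"))
     @result{} "data:text/plain;base64,SGVsbG8="
@end example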
589
590 @node nfs
591 @section nfs
592 @cindex NFS
593 @cindex Network File System
594 @cindex automounter
595
596 @example
597 nfs://@var{user}:@var{password}@@@var{host}:@var{port}/@var{file}
598 @end example
599
600 The @samp{nfs:} scheme is defined in RFC 2224. It is similar to
601 @samp{ftp:} except that it points to a file on a remote host that is
602 handled by the automounter on the local host.
603
604 @defvar url-nfs-automounter-directory-spec
605 A string saying how to invoke the NFS automounter. Certain @samp{%}
606 sequences are recognized:
607
608 @table @samp
609 @item %h
610 The hostname of the NFS server;
611 @item %n
612 The port number of the NFS server;
613 @item %u
614 The username to use to authenticate;
615 @item %p
616 The password to use to authenticate;
617 @item %f
618 The filename on the remote server;
619 @item %%
620 A literal @samp{%}.
621 @end table
622
623 Each can be used any number of times.
624 @end defvar
625
626 @node cid
627 @section cid
628 @cindex Content-ID
629
630 The @samp{cid:} scheme is defined in RFC 2111.
631
632 @node about
633 @section about
634
635 @node ldap
636 @section ldap
637 @cindex LDAP
638 @cindex Lightweight Directory Access Protocol
639
640 The LDAP scheme is defined in RFC 2255.
641
642 @node imap
643 @section imap
644 @cindex IMAP
645
646 The @samp{imap:} scheme is defined in RFC 2192.
647
648 @node man
649 @section man
650 @cindex @command{man}
651 @cindex Unix man pages
652 @findex man
653
654 @example
655 @samp{man:@var{page-spec}}
656 @end example
657
658 This is a non-standard scheme. @var{page-spec} is passed directly to
659 the Lisp @code{man} function.
660
661 @node Defining New URLs
662 @chapter Defining New URLs
663
664 @menu
665 * Naming conventions::
666 * Required functions::
667 * Optional functions::
668 * Asynchronous fetching::
669 * Supporting file-name-handlers::
670 @end menu
671
672 @node Naming conventions
673 @section Naming conventions
674
675 @node Required functions
676 @section Required functions
677
678 @node Optional functions
679 @section Optional functions
680
681 @node Asynchronous fetching
682 @section Asynchronous fetching
683
684 @node Supporting file-name-handlers
685 @section Supporting file-name-handlers
686
687 @node General Facilities
688 @chapter General Facilities
689
690 @menu
691 * Disk Caching::
692 * Proxies::
693 * Gateways in general::
694 * History::
695 @end menu
696
697 @node Disk Caching
698 @section Disk Caching
699 @cindex Caching
700 @cindex Persistent Cache
701 @cindex Disk Cache
702
703 The disk cache stores retrieved documents locally, whence they can be
704 retrieved more quickly. When requesting a URL that is in the cache,
705 the library checks to see if the page has changed since it was last
706 retrieved from the remote machine. If not, the local copy is used,
707 saving the transmission over the network.
708 @cindex Cleaning the cache
709 @cindex Clearing the cache
710 @cindex Cache cleaning
711 Currently the cache isn't cleared automatically.
712 @c Running the @code{clean-cache} shell script
713 @c fist is recommended, to allow for future cleaning of the cache. This
714 @c shell script will remove all files that have not been accessed since it
715 @c was last run. To keep the cache pared down, it is recommended that this
716 @c script be run from @i{at} or @i{cron} (see the manual pages for
717 @c crontab(5) or at(1) for more information)
718
719 @defopt url-automatic-caching
720 Setting this variable non-@code{nil} causes documents to be cached
721 automatically.
722 @end defopt
723
724 @defopt url-cache-directory
725 This variable specifies the
726 directory in which to store the cache files. It defaults to the subdirectory
727 @file{cache} of @code{url-configuration-directory}.
728 @end defopt
729
730 @c Fixme: function v. option, but neither used.
731 @c @findex url-cache-expired
732 @c @defopt url-cache-expired
733 @c This is a function to decide whether or not a cache entry has expired.
734 @c It takes two times as it parameters and returns non-@code{nil} if the
735 @c second time is ``too old'' when compared with the first time.
736 @c @end defopt
737
738 @defopt url-cache-creation-function
739 The cache relies on a scheme for mapping URLs to files in the cache.
740 This variable names a function which sets the type of cache to use.
741 It takes a URL as argument and returns the absolute file name of the
742 corresponding cache file. The two supplied possibilities are
743 @code{url-cache-create-filename-using-md5} and
744 @code{url-cache-create-filename-human-readable}.
745 @end defopt
746
747 @defun url-cache-create-filename-using-md5 url
748 Creates a cache file name from @var{url} using MD5 hashing.
749 @findex md5
750 This creates entries with very few cache collisions and is fast if
751 you have the @code{md5} function as a primitive (Emacs 21 and XEmacs).
752 @smallexample
753 (url-cache-create-filename-using-md5 "http://www.example.com/foo/bar")
754 @result{} "/home/fx/.url/cache/fx/http/com/example/www/b8a35774ad20db71c7c3409a5410e74f"
755 @end smallexample
756 @end defun
757
758 @defun url-cache-create-filename-human-readable url
759 Creates a cache file name from @var{url} more obviously connected to
760 @var{url} than for @code{url-cache-create-filename-using-md5}, but
761 more likely to conflict with other files.
762 @smallexample
763 (url-cache-create-filename-human-readable "http://www.example.com/foo/bar")
764 @result{} "/home/fx/.url/cache/fx/http/com/example/www/foo/bar"
765 @end smallexample
766 @end defun
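
A caching setup that combines the options above might be sketched as
follows. Choosing the MD5-based naming function here is only an
illustration, not a recommendation.

@example
(setq url-automatic-caching t
      url-cache-creation-function 'url-cache-create-filename-using-md5)
@end example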
767
768 @c Fixme: never actually used currently?
769 @c @defopt url-standalone-mode
770 @c @cindex Relying on cache
771 @c @cindex Cache only mode
772 @c @cindex Standalone mode
773 @c If this variable is non-@code{nil}, the library relies solely on the
774 @c cache for fetching documents and avoids checking if they have changed
775 @c on remote servers.
776 @c @end defopt
777
778 @c With a large cache of documents on the local disk, it can be very handy
779 @c when traveling, or any other time the network connection is not active
780 @c (a laptop with a dial-on-demand PPP connection, etc). Emacs/W3 can rely
781 @c solely on its cache, and avoid checking to see if the page has changed
782 @c on the remote server. In the case of a dial-on-demand PPP connection,
783 @c this will keep the phone line free as long as possible, only bringing up
784 @c the PPP connection when asking for a page that is not located in the
785 @c cache. This is very useful for demonstrations as well.
786
787 @node Proxies
788 @section Proxies and Gatewaying
789
790 @c fixme: check/document url-ns stuff
791 @cindex proxy servers
792 @cindex proxies
793 @cindex environment variables
794 @vindex HTTP_PROXY
795 Proxy servers are commonly used to provide gateways through firewalls
796 or as caches serving some more-or-less local network. Each protocol
797 (HTTP, FTP, etc.)@: can have a different gateway server. By convention,
798 proxying is configured for many different programs at once through
799 environment variables of the form @code{@var{protocol}_proxy}, where
800 @var{protocol} is one of the supported network protocols (@code{http},
801 @code{ftp} etc.). The library recognizes such variables in either
802 upper or lower case. Their values are of one of the forms:
803 @itemize @bullet
804 @item @code{@var{host}:@var{port}}
805 @item A full URL;
806 @item Simply a host name.
807 @end itemize
808
809 @vindex NO_PROXY
810 The @code{NO_PROXY} environment variable specifies URLs that should be
811 excluded from proxying (on servers that should be contacted directly).
812 This should be a comma-separated list of hostnames, domain names, or a
813 mixture of both. Asterisks can be used as wildcards, but other
814 clients may not support that. Domain names may be indicated by a
815 leading dot. For example:
816 @example
817 NO_PROXY="*.aventail.com,home.com,.seanet.com"
818 @end example
819 @noindent says to contact all machines in the @samp{aventail.com} and
820 @samp{seanet.com} domains directly, as well as the machine named
821 @samp{home.com}. If @code{NO_PROXY} isn't defined, @code{no_PROXY}
822 and @code{no_proxy} are also tried, in that order.
823
824 Proxies may also be specified directly in Lisp.
825
826 @defopt url-proxy-services
827 This variable is an alist of URL schemes and proxy servers that
828 gateway them. The items are of the form @w{@code{(@var{scheme}
829 . @var{host}:@var{portnumber})}}, which says that the URL @var{scheme} is
830 gatewayed through @var{portnumber} on the specified @var{host}. An
831 exception is the pseudo scheme @code{"no_proxy"}, which is paired with
832 a regexp matching host names not to be proxied. This variable is
833 initialized from the environment as above.
834
835 @example
836 (setq url-proxy-services
837       '(("http" . "proxy.aventail.com:80")
838         ("no_proxy" . "^.*\\(aventail\\|seanet\\)\\.com")))
839 @end example
840 @end defopt
841
842 @node Gateways in general
843 @section Gateways in General
844 @cindex gateways
845 @cindex firewalls
846
847 The library provides a general gateway layer through which all
848 networking passes. It can both control access to the network and
849 provide access through gateways in firewalls. This may make direct
850 connexions in some cases and pass through some sort of gateway in
851 others.@footnote{Proxies (which only operate over HTTP) are
852 implemented using this.} The library's basic function responsible for
853 making connexions is @code{url-open-stream}.
854
855 @defun url-open-stream name buffer host service
856 @cindex opening a stream
857 @cindex stream, opening
858 Open a stream to @var{host}, possibly via a gateway. The other
859 arguments are as for @code{open-network-stream}. This will not make a
860 connexion if @code{url-gateway-unplugged} is non-@code{nil}.
861 @end defun
862
863 @defvar url-gateway-local-host-regexp
864 This is a regular expression that matches local hosts that do not
865 require the use of a gateway. If @code{nil}, all connexions are made
866 through the gateway.
867 @end defvar
868
869 @defvar url-gateway-method
870 This variable controls which gateway method is used. It may be useful
871 to bind it temporarily in some applications. It has values taken from
872 a list of symbols. Possible values are:
873
874 @table @code
875 @item telnet
876 @cindex @command{telnet}
877 Use this method if you must first telnet and log into a gateway host,
878 and then run telnet from that host to connect to outside machines.
879
880 @item rlogin
881 @cindex @command{rlogin}
882 This method is identical to @code{telnet}, but uses @command{rlogin}
883 to log into the remote machine without having to send the username and
884 password over the wire every time.
885
886 @item socks
887 @cindex @sc{socks}
888 Use if the firewall has a @sc{socks} gateway running on it. The
889 @sc{socks} v5 protocol is defined in RFC 1928.
890
891 @c @item ssl
892 @c This probably shouldn't be documented
893 @c Fixme: why not? -- fx
894
895 @item native
896 This method uses Emacs's builtin networking directly. This is the
897 default. It can be used only if there is no firewall blocking access.
898 @end table
899 @end defvar
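
A temporary binding might look like the following sketch, which forces
a direct connexion for a single call whatever the global setting.  The
wrapper name @code{my-url-open-direct} is an assumption made for this
example.

```elisp
(require 'url-gw)

(defun my-url-open-direct (name buffer host service)
  "Like `url-open-stream', but always connect directly.
The binding of `url-gateway-method' is in effect only for the
duration of this call."
  (let ((url-gateway-method 'native))
    (url-open-stream name buffer host service)))
```

Using @code{let} rather than @code{setq} means the previous method is
restored automatically, even if the call signals an error.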
900
901 The following variables control the gateway methods.
902
903 @defopt url-gateway-telnet-host
904 The gateway host to telnet to. Once logged in there, you then telnet
905 out to the hosts you want to connect to.
906 @end defopt
907 @defopt url-gateway-telnet-parameters
908 This should be a list of parameters to pass to the @command{telnet} program.
909 @end defopt
910 @defopt url-gateway-telnet-password-prompt
911 This is a regular expression that matches the password prompt when
912 logging in.
913 @end defopt
914 @defopt url-gateway-telnet-login-prompt
915 This is a regular expression that matches the username prompt when
916 logging in.
917 @end defopt
918 @defopt url-gateway-telnet-user-name
919 The username to log in with.
920 @end defopt
921 @defopt url-gateway-telnet-password
922 The password to send when logging in.
923 @end defopt
924 @defopt url-gateway-prompt-pattern
925 This is a regular expression that matches the shell prompt.
926 @end defopt
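
Taken together, a @code{telnet} gateway setup could be sketched as
follows.  Every host name, user name, password and regular expression
shown here is a placeholder chosen for illustration; the right values
depend entirely on the gateway host in question.

```elisp
(require 'url-gw)

;; All host names, prompts and credentials below are placeholders.
(setq url-gateway-method 'telnet
      url-gateway-telnet-host "gateway.example.com"
      url-gateway-telnet-parameters '("-8")
      url-gateway-telnet-login-prompt "login: *\\'"
      url-gateway-telnet-password-prompt "[Pp]assword: *\\'"
      url-gateway-telnet-user-name "alice"
      url-gateway-telnet-password "not-a-real-password"
      url-gateway-prompt-pattern "[#$%>] *\\'")
```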
927
928 @defopt url-gateway-rlogin-host
929 Host to @samp{rlogin} to before telnetting out.
930 @end defopt
931 @defopt url-gateway-rlogin-parameters
932 Parameters to pass to @samp{rsh}.
933 @end defopt
934 @defopt url-gateway-rlogin-user-name
935 User name to use when logging in to the gateway.
936 @end defopt
940
941 @defopt socks-server
942 This specifies the default server. It takes the form
943 @w{@code{("Default server" @var{server} @var{port} @var{version})}}
944 where @var{version} can be either 4 or 5.
945 @end defopt
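
A minimal sketch of selecting a @sc{socks} version 5 gateway follows.
The server name @samp{socks.example.com} and port 1080 are
placeholders, and the sketch assumes the variable is provided by the
@file{socks} library.

```elisp
(require 'socks)
(require 'url-gw)

;; "socks.example.com" and port 1080 stand in for a real server.
(setq socks-server '("Default server" "socks.example.com" 1080 5)
      url-gateway-method 'socks)
```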
946 @defvar socks-password
947 If this is @code{nil} then you will be asked for the password,
948 otherwise it will be used as the password for authenticating you to
949 the @sc{socks} server.
950 @end defvar
951 @defvar socks-username
952 This is the username to use when authenticating yourself to the
953 @sc{socks} server. By default this is your login name.
954 @end defvar
955 @defvar socks-timeout
956 This controls how long, in seconds, to wait for responses from the
957 @sc{socks} server; it is 5 by default.
958 @end defvar
959 @c fixme: these have been effectively commented-out in the code
960 @c @defopt socks-server-aliases
961 @c This is a list of server aliases. It is a list of aliases of the form
962 @c @var{(alias hostname port version)}.
963 @c @end defopt
964 @c @defopt socks-network-aliases
965 @c This is a list of network aliases. Each entry in the list takes the form
966 @c @var{(alias (network))} where @var{alias} is a string that names the
967 @c @var{network}. The networks can contain a pair (not a dotted pair) of
968 @c @sc{ip} addresses which specify a range of @sc{ip} addresses, an @sc{ip}
969 @c address and a netmask, a domain name or a unique hostname or @sc{ip}
970 @c address.
971 @c @end defopt
972 @c @defopt socks-redirection-rules
973 @c This is a list of redirection rules. Each rule takes the form
974 @c @var{(Destination network Connection type)} where @var{Destination
975 @c network} is a network alias from @code{socks-network-aliases} and
976 @c @var{Connection type} can be @code{nil} in which case a direct
977 @c connection is used, or it can be an alias from
978 @c @code{socks-server-aliases} in which case that server is used as a
979 @c proxy.
980 @c @end defopt
981 @defopt socks-nslookup-program
982 @cindex @command{nslookup}
983 This is the @samp{nslookup} program. It is @code{"nslookup"} by default.
984 @end defopt
985
986 @menu
987 * Suppressing network connexions::
988 @end menu
989 @c * Broken hostname resolution::
990
991 @node Suppressing network connexions
992 @subsection Suppressing Network Connexions
993
994 @cindex network connexions, suppressing
995 @cindex suppressing network connexions
996 @cindex bugs, HTML
997 @cindex HTML `bugs'
998 In some circumstances it is desirable to suppress making network
999 connexions. A typical case is when rendering HTML in a mail user
1000 agent, when external URLs should not be activated, particularly to
1001 avoid `bugs' which `call home' by fetching single-pixel images and the
1002 like. To arrange this, bind the following variable for the duration
1003 of such processing.
1004
1005 @defvar url-gateway-unplugged
1006 If this variable is non-@code{nil} new network connexions are never
1007 opened by the URL library.
1008 @end defvar
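
One way to arrange such a binding is sketched below.  The helper
@code{my-call-unplugged} is hypothetical and not part of the library;
it simply calls a function with @code{url-gateway-unplugged} bound to
@code{t}.

```elisp
(require 'url-gw)

(defun my-call-unplugged (function &rest args)
  "Call FUNCTION with ARGS while the URL library is unplugged.
Return whatever FUNCTION returns."
  (let ((url-gateway-unplugged t))
    (apply function args)))
```

A mail user agent could then render a message part with something
like @code{(my-call-unplugged #'my-render-html part)}, where
@code{my-render-html} stands for the agent's own rendering function.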
1009
1010 @c @node Broken hostname resolution
1011 @c @subsection Broken Hostname Resolution
1012
1013 @c @cindex hostname resolver
1014 @c @cindex resolver, hostname
1015 @c Some C libraries do not include the hostname resolver routines in
1016 @c their static libraries. If Emacs was linked statically, and was not
1017 @c linked with the resolver libraries, it will not be able to get to any
1018 @c machines off the local network. This is characterized by being able
1019 @c to reach someplace with a raw ip number, but not its hostname
1020 @c (@url{http://129.79.254.191/} works, but
1021 @c @url{http://www.cs.indiana.edu/} doesn't). This used to happen on
1022 @c SunOS4 and Ultrix, but is now probably rare. If Emacs can't be
1023 @c rebuilt linked against the resolver library, it can use the external
1024 @c @command{nslookup} program instead.
1025
1026 @c @defopt url-gateway-broken-resolution
1027 @c @cindex @code{nslookup} program
1028 @c @cindex program, @code{nslookup}
1029 @c If non-@code{nil}, this variable says to use the program specified by
1030 @c @code{url-gateway-nslookup-program} program to do hostname resolution.
1031 @c @end defopt
1032
1033 @c @defopt url-gateway-nslookup-program
1034 @c The name of the program to do hostname lookup if Emacs can't do it
1035 @c directly. This program should expect a single argument on the command
1036 @c line---the hostname to resolve---and should produce output similar to
1037 @c the standard Unix @command{nslookup} program:
1038 @c @example
1039 @c Name: www.cs.indiana.edu
1040 @c Address: 129.79.254.191
1041 @c @end example
1042 @c @end defopt
1043
1044 @node History
1045 @section History
1046
1047 The library can maintain a global history list tracking URLs accessed.
1048 URL completion can be done from it. The history mechanism is set up
1049 @findex url-do-setup
1050 automatically via @code{url-do-setup} when it is configured to be on.
1051 Note that the size of the history list is currently not limited.
1052
1053 @vindex url-history-hash-table
1054 The history `list' is actually a hash table,
1055 @code{url-history-hash-table}. It contains access times keyed by URL
1056 strings. The times are in the format returned by @code{current-time}.
1057
1058 @defun url-history-update-url url time
1059 This function updates the hsitory table with an entry for @var{url}
1060 accessed at the gievn @var{time}.
1061 @end defun
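
The following sketch records an access and reads the time back from
the hash table.  The accessor @code{my-url-history-last-access} is a
name invented for this example, and the sketch assumes that these
definitions live in the @file{url-history} library.

```elisp
(require 'url-history)

(defun my-url-history-last-access (url)
  "Return the recorded access time of URL, or nil if it has none."
  (gethash url url-history-hash-table))

;; Record an access now, then look the entry up again.
(url-history-update-url "http://www.gnu.org/" (current-time))
(my-url-history-last-access "http://www.gnu.org/")
```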
1062
1063 @defopt url-history-track
1064 If non-@code{nil}, the library will keep track of all the URLs
1065 accessed. If it is @code{t}, the list is saved to disk at the end of
1066 each Emacs session. The default is @code{nil}.
1067 @end defopt
1068
1069 @defopt url-history-file
1070 The file storing the history list between sessions. It defaults to
1071 @file{history} in @code{url-configuration-directory}.
1072 @end defopt
1073
1074 @defopt url-history-save-interval
1075 @findex url-history-setup-save-timer
1076 The number of seconds between automatic saves of the history list.
1077 Default is one hour. Note that if you change this variable directly,
1078 rather than using Custom, after @code{url-do-setup} has been run, you
1079 need to run the function @code{url-history-setup-save-timer}.
1080 @end defopt
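
Changing the interval from Lisp might therefore look like this sketch.
The value of 600 seconds is only an example.

```elisp
(require 'url-history)

;; Save the history every ten minutes instead of every hour.
(setq url-history-save-interval 600)
;; Reset the save timer so that the new interval is picked up.
(url-history-setup-save-timer)
```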
1081
1082 @defun url-history-parse-history &optional fname
1083 Parses the history file @var{fname} (default @code{url-history-file})
1084 and sets up the history list.
1085 @end defun
1086
1087 @defun url-history-save-history &optional fname
1088 Saves the current history to file @var{fname} (default
1089 @code{url-history-file}).
1090 @end defun
1091
1092 @defun url-completion-function string predicate function
1093 You can use this function to do completion of URLs from the history.
1094 @end defun
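
A sketch of reading a URL in the minibuffer with completion from the
history follows.  The command name @code{my-url-read-from-history} is
invented for this example, and the sketch assumes that
@code{url-completion-function} follows the usual calling convention
for completion functions, so that it can be passed to
@code{completing-read} as the collection.

```elisp
(require 'url-history)

(defun my-url-read-from-history (prompt)
  "Read a URL in the minibuffer with PROMPT, completing from the history."
  (completing-read prompt #'url-completion-function))
```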
1095
1096 @node Customization
1097 @chapter Customization
1098
1099 @section Environment Variables
1100
1101 @cindex environment variables
1102 The following environment variables affect the library's operation at
1103 startup.
1104
1105 @table @code
1106 @item TMPDIR
1107 @vindex TMPDIR
1108 @vindex url-temporary-directory
1109 If this is defined, @code{url-temporary-directory} is initialized from
1110 it.
1111 @end table
1112
1113 @section General User Options
1114
1115 The following user options, settable with Customize, affect the
1116 general operation of the package.
1117
1118 @defopt url-debug
1119 @cindex debugging
1120 Specifies the types of debug messages which the library logs to
1121 the @code{*URL-DEBUG*} buffer.
1122 @code{t} means log all messages.
1123 A number means log all messages and show them with @code{message}.
1124 It may also be a list of the types of messages to be logged.
1125 @end defopt
1126 @defopt url-personal-mail-address
1127 @end defopt
1128 @defopt url-privacy-level
1129 @end defopt
1130 @defopt url-uncompressor-alist
1131 @end defopt
1132 @defopt url-passwd-entry-func
1133 @end defopt
1134 @defopt url-standalone-mode
1135 @end defopt
1136 @defopt url-bad-port-list
1137 @end defopt
1138 @defopt url-max-password-attempts
1139 @end defopt
1140 @defopt url-temporary-directory
1141 @end defopt
1142 @defopt url-show-status
1143 @end defopt
1144 @defopt url-confirmation-func
1145 The function to use for asking yes or no questions. This is normally
1146 either @code{y-or-n-p} or @code{yes-or-no-p}, but could be another
1147 function taking a single argument (the prompt) and returning @code{t}
1148 only if an affirmative answer is given.
1149 @end defopt
1150 @defopt url-gateway-method
1151 @c fixme: describe gatewaying
1152 A symbol specifying the type of gateway support to use for connexions
1153 from the local machine. The supported methods are:
1154
1155 @table @code
1156 @item telnet
1157 Run telnet in a subprocess to connect;
1158 @item rlogin
1159 Rlogin to another machine to connect;
1160 @item socks
1161 Connect through a socks server;
1162 @item ssl
1163 Connect with SSL;
1164 @item native
1165 Connect directly.
1166 @end table
1167 @end defopt
1168
1169 @node Function Index
1170 @unnumbered Command and Function Index
1171 @printindex fn
1172
1173 @node Variable Index
1174 @unnumbered Variable Index
1175 @printindex vr
1176
1177 @node Concept Index
1178 @unnumbered Concept Index
1179 @printindex cp
1180
1181 @setchapternewpage odd
1182 @contents
1183 @bye