Mercurial > emacs
changeset 58830:27baac8434ba
url.texi: New file.
author | Stefan Monnier <monnier@iro.umontreal.ca> |
---|---|
date | Tue, 07 Dec 2004 16:55:48 +0000 |
parents | bf43c774d02c |
children | 408c5135b0a2 |
files | man/ChangeLog man/Makefile.in man/url.texi |
diffstat | 3 files changed, 1196 insertions(+), 2 deletions(-) |
--- a/man/ChangeLog Tue Dec 07 16:52:54 2004 +0000
+++ b/man/ChangeLog Tue Dec 07 16:55:48 2004 +0000
@@ -1,3 +1,9 @@
+2004-12-07 Stefan <monnier@iro.umontreal.ca>
+
+	* url.texi: New file.
+
+	* Makefile.in (INFO_TARGETS, DVI_TARGETS, ../info/url, url.dvi): Add it.
+
 2004-12-06 Jay Belanger <belanger@truman.edu>
 
 	* calc.texi (Using Calc): Remove paragraph about installation.
--- a/man/Makefile.in Tue Dec 07 16:52:54 2004 +0000
+++ b/man/Makefile.in Tue Dec 07 16:55:48 2004 +0000
@@ -39,7 +39,7 @@
 	../info/sc ../info/vip ../info/viper ../info/widget \
 	../info/efaq ../info/ada-mode ../info/autotype ../info/calc \
 	../info/idlwave ../info/eudc ../info/ebrowse ../info/pcl-cvs \
-	../info/woman ../info/eshell ../info/org \
+	../info/woman ../info/eshell ../info/org ../info/url \
 	../info/speedbar ../info/tramp ../info/ses ../info/smtpmail \
 	../info/flymake
 DVI_TARGETS = emacs.dvi calc.dvi cc-mode.dvi cl.dvi dired-x.dvi \
@@ -47,7 +47,7 @@
 	gnus.dvi message.dvi sieve.dvi pgg.dvi mh-e.dvi \
 	reftex.dvi sc.dvi vip.dvi viper.dvi widget.dvi faq.dvi \
 	ada-mode.dvi autotype.dvi idlwave.dvi eudc.dvi ebrowse.dvi \
-	pcl-cvs.dvi woman.dvi eshell.dvi org.el \
+	pcl-cvs.dvi woman.dvi eshell.dvi org.dvi url.dvi \
 	speedbar.dvi tramp.dvi ses.dvi smtpmail.dvi flymake.dvi \
 	emacs-xtra.dvi
 INFOSOURCES = info.texi
@@ -291,6 +291,11 @@
 org.dvi: org.texi
 	$(ENVADD) $(TEXI2DVI) ${srcdir}/org.texi
 
+../info/url: url.texi
+	cd $(srcdir); $(MAKEINFO) url.texi
+url.dvi: url.texi
+	$(ENVADD) $(TEXI2DVI) ${srcdir}/url.texi
+
 ../info/speedbar: speedbar.texi
 	cd $(srcdir); $(MAKEINFO) speedbar.texi
 speedbar.dvi: speedbar.texi
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/man/url.texi Tue Dec 07 16:55:48 2004 +0000
@@ -0,0 +1,1183 @@
+\input texinfo
+@setfilename url.info
+@settitle URL Programmer's Manual
+
+@iftex
+@c @finalout
+@end iftex
+@c @setchapternewpage odd
+@c @smallbook
+
+@tex
+\overfullrule=0pt
+%\global\baselineskip 30pt % for printing in double space
+@end tex
+@dircategory World Wide Web
+@dircategory GNU Emacs Lisp
+@direntry
+* URL: (url). URL loading package.
+@end direntry
+
+@ifnottex
+This file documents the URL loading package.
+
+Copyright (C) 1996, 1997, 1998, 1999, 2002, 2004 Free Software Foundation
+Copyright (C) 1993, 1994, 1995, 1996 William M. Perry
+
+Permission is granted to copy, distribute and/or modify this document
+under the terms of the GNU Free Documentation License, Version 1.1 or
+any later version published by the Free Software Foundation; with the
+Invariant Sections being
+``GNU GENERAL PUBLIC LICENSE''. A copy of the
+license is included in the section entitled ``GNU Free Documentation
+License.''
+@end ifnottex
+
+@c
+@titlepage
+@sp 6
+@center @titlefont{URL}
+@center @titlefont{Programmer's Manual}
+@sp 4
+@center First Edition, URL Version 2.0
+@sp 1
+@c @center December 1999
+@sp 5
+@center William M. Perry
+@center @email{wmperry@@gnu.org}
+@center David Love
+@center @email{fx@@gnu.org}
+@page
+@vskip 0pt plus 1filll
+Copyright @copyright{} 1993, 1994, 1995, 1996 William M. Perry@*
+Copyright @copyright{} 1996, 1997, 1998, 1999, 2002 Free Software Foundation
+
+Permission is granted to copy, distribute and/or modify this document
+under the terms of the GNU Free Documentation License, Version 1.1 or
+any later version published by the Free Software Foundation; with the
+Invariant Sections being
+``GNU GENERAL PUBLIC LICENSE''. A copy of the
+license is included in the section entitled ``GNU Free Documentation
+License.''
+@end titlepage
+@page
+@node Top
+@top URL
+
+
+
+@menu
+* Getting Started:: Preparing your program to use URLs.
+* Retrieving URLs:: How to use this package to retrieve a URL.
+* Supported URL Types:: Descriptions of URL types currently supported.
+* Defining New URLs:: How to define a URL loader for a new protocol.
+* General Facilities:: URLs can be cached, accessed via a gateway
+                        and tracked in a history list.
+* Customization:: Variables you can alter.
+* Function Index::
+* Variable Index::
+* Concept Index::
+@end menu
+
+@node Getting Started
+@chapter Getting Started
+@cindex URLs, definition
+@cindex URIs
+
+@dfn{Uniform Resource Locators} (URLs) are a specific form of
+@dfn{Uniform Resource Identifiers} (URI) described in RFC 2396 which
+updates RFC 1738 and RFC 1808. RFC 2016 defines uniform resource
+agents.
+
+URIs have the form @var{scheme}:@var{scheme-specific-part}, where the
+@var{scheme}s supported by this library are described below.
+@xref{Supported URL Types}.
+
+FTP NFS, HTTP, HTTPS, @code{rlogin}, @code{telnet}, tn3270,
+IRC and gopher URLs all have the form
+
+@example
+@var{scheme}://@r{[}@var{userinfo}@@@r{]}@var{hostname}@r{[}:@var{port}@r{]}@r{[}/@var{path}@r{]}
+@end example
+@noindent
+where @samp{@r{[}} and @samp{@r{]}} delimit optional parts.
+@var{userinfo} sometimes takes the form @var{username}:@var{password}
+but you should beware of the security risks of sending cleartext
+passwords. @var{hostname} may be a domain name or a dotted decimal
+address. If the @samp{:@var{port}} is omitted then the library will
+use the `well known' port for that service when accessing URLs. With
+the possible exception of @code{telnet}, it is rare for ports to be
+specified, and it is possible using a non-standard port may have
+undesired consequences if a different service is listening on that
+port (e.g.@: an HTTP URL specifying the SMTP port can cause mail to be
+sent).@c , but @xref{Other Variables, url-bad-port-list}.
+The meaning of
+the @var{path} component depends on the service.
+
+The library depends on MIME support provided by the @samp{mm-}
+packages from Gnus 5.8 or later. @xref{(emacs-mime)Top, The MIME
+library}.
+
+@menu
+* Configuration::
+* Parsed URLs:: URLs are parsed into vector structures.
+@end menu
+
+@node Configuration
+@section Configuration
+
+@defvar url-configuration-directory
+@cindex @file{~/.url}
+@cindex configuration files
+The directory in which URL configuration files, the cache etc.,
+reside. Default @file{~/.url}.
+@end defvar
+
+@node Parsed URLs
+@section Parsed URLs
+@cindex parsed URLs
+The library functions typically operate on @dfn{parsed} versions of
+URLs. These are actually vectors of the form:
+
+@example
+[@var{type} @var{user} @var{password} @var{host} @var{port} @var{file} @var{target} @var{attributes} @var{full}]
+@end example
+
+@noindent where
+@table @var
+@item type
+is the type of the URL scheme, e.g.@: @code{http}
+@item user
+is the username associated with it, or @code{nil};
+@item password
+is the user password associated with it, or @code{nil};
+@item host
+is the host name associated with it, or @code{nil};
+@item port
+is the port number associated with it, or @code{nil};
+@item file
+is the `file' part of it, or @code{nil}. This doesn't necessarily
+actually refer to a file;
+@item target
+is the target part, or @code{nil};
+@item attributes
+is the attributes associated with it, or @code{nil};
+@item full
+is @code{t} for a fully-specified URL, with a host part indicated by
+@samp{//} after the scheme part.
+@end table
+
+@findex url-type
+@findex url-user
+@findex url-password
+@findex url-host
+@findex url-port
+@findex url-file
+@findex url-target
+@findex url-attributes
+@findex url-full
+@findex url-set-type
+@findex url-set-user
+@findex url-set-password
+@findex url-set-host
+@findex url-set-port
+@findex url-set-file
+@findex url-set-target
+@findex url-set-attributes
+@findex url-set-full
+These attributes have accessors named @code{url-@var{part}}, where
+@var{part} is the name of one of the elements above, e.g.@:
+@code{url-host}. Similarly, there are setters of the form
+@code{url-set-@var{part}}.
+
+There are functions for parsing and unparsing between the string and
+vector forms.
+
+@defun url-generic-parse-url url
+Return a parsed version of the string @var{url}.
+@end defun
+
+@defun url-recreate-url url
+@cindex unparsing URLs
+Recreates a URL string from the parsed @var{url}.
+@end defun
+
+@node Retrieving URLs
+@chapter Retrieving URLs
+
+@defun url-retrieve-synchronously url
+Retrieve @var{url} synchronously and return a buffer containing the
+data. @var{url} is either a string or a parsed URL structure. Return
+@var{nil} if there are no data associated with it (the case for dired,
+info, or mailto URLs that need no further processing).
+@end defun
+
+@defun url-retrieve url callback &optional cbargs
+Retrieve @var{url} asynchronously and call @var{callback} with args
+@var{cbargs} when finished. The callback is called when the object
+has been completely retrieved, with the current buffer containing the
+object and any MIME headers associated with it. @var{url} is either a
+string or a parsed URL structure. Returns the buffer @var{url} will
+load into, or @var{nil} if the process has already completed.
+@end defun
+
+@node Supported URL Types
+@chapter Supported URL Types
+
+@menu
+* http/https:: Hypertext Transfer Protocol.
+* file/ftp:: Local files and FTP archives.
+* info:: Emacs `Info' pages.
+* mailto:: Sending email.
+* news/nntp/snews:: Usenet news.
+* rlogin/telnet/tn3270:: Remote host connectivity.
+* irc:: Internet Relay Chat.
+* data:: Embedded data URLs.
+* nfs:: Networked File System
+@c * finger::
+@c * gopher::
+@c * netrek::
+@c * prospero::
+* cid:: Content-ID.
+* about::
+* ldap:: Lightweight Directory Access Protocol
+* imap:: IMAP mailboxes.
+* man:: Unix man pages.
+@end menu
+
+@node http/https
+@section @code{http} and @code{https}
+
+The scheme @code{http} is Hypertext Transfer Protocol. The library
+supports version 1.1, specified in RFC 2616. (This supersedes 1.0,
+defined in RFC 1945) HTTP URLs have the following form, where most of
+the parts are optional:
+@example
+http://@var{user}:@var{password}@var{host}:@var{port}/@var{path}?@var{searchpart}#@var{fragment}
+@end example
+@c The @code{:@var{port}} part is optional, and @var{port} defaults to
+@c 80. The @code{/@var{path}} part, if present, is a slash-separated
+@c series elements. The @code{?@var{searchpart}}, if present, is the
+@c query for a search or the content of a form submission. The
+@c @code{#fragment} part, if present, is a location in the document.
+
+The scheme @code{https} is a secure version of @code{http}, with
+transmission via SSL. It is defined in RFC 2069. Its default port is
+443. This scheme depends on SSL support in Emacs via the
+@file{ssl.el} library and is actually implemented by forcing the
+@code{ssl} gateway method to be used. @xref{Gateways in general}.
+
+@defopt url-honor-refresh-requests
+This controls honouring of HTTP @samp{Refresh} headers by which
+servers can direct clients to reload documents from the same URL or a
+or different one. @code{nil} means they will not be honoured,
+@code{t} (the default) means they will always be honoured, and
+otherwise the user will be asked on each request.
+@end defopt
+
+
+@menu
+* Cookies::
+* HTTP language/coding::
+* HTTP URL Options::
+* Dealing with HTTP documents::
+@end menu
+
+@node Cookies
+@subsection Cookies
+
+@defopt url-cookie-file
+The file in which cookies are stored, defaulting to @file{cookies} in
+the directory specified by @code{url-configuration-directory}.
+@end defopt
+
+@defopt url-cookie-confirmation
+Specifies whether confirmation is require to accept cookies.
+@end defopt
+
+@defopt url-cookie-multiple-line
+Specifies whether to put all cookies for the server on one line in the
+HTTP request to satisfy broken servers like
+@url{http://www.hotmail.com}.
+@end defopt
+
+@defopt url-cookie-trusted-urls
+A list of regular expressions matching URLs from which to accept
+cookies always.
+@end defopt
+
+@defopt url-cookie-untrusted-urls
+A list of regular expressions matching URLs from which to reject
+cookies always.
+@end defopt
+
+@defopt url-cookie-save-interval
+The number of seconds between automatic saves of cookies to disk.
+Default is one hour.
+@end defopt
+
+
+@node HTTP language/coding
+@subsection Language and Encoding Preferences
+
+HTTP allows clients to express preferences for the language and
+encoding of documents which servers may honour.
+
+@defopt url-mime-charset-string
+@cindex character sets
+@cindex coding systems
+This variable specifies a preference for character sets when documents
+can be served in more than one encoding.
+
+HTTP allows specifying a list of MIME charsets which indicate your
+preferred character set encodings, e.g.@: Latin-9 or Big5, and these
+can be weighted. In Emacs 21 this list is generated automatically
+from the list of defined coding systems which have associated MIME
+types. These are sorted by coding priority. @xref{Recognize Coding,
+, Recognizing Coding Systems, emacs, GNU Emacs Manual}.
+@end defopt
+
+@defopt url-mime-language-string
+@cindex language preferences
+A string specifying the preferred language when servers can serve
+files in several languages. Use RFC 1766 abbreviations, e.g.@:
+@samp{en} for English, @samp{de} for German. It can be a
+comma-separated list in descending order of preference. The ordering
+can be made explicit using `q' factors defined by HTTP, e.g.@:
+@w{@samp{de, en-gb;q=0.8, en;q=0.7}}. It can be @samp{*} to get the
+first available language (as opposed to the default).
+@end defopt
+
+@node HTTP URL Options
+@subsection HTTP URL Options
+
+HTTP supports an @samp{OPTIONS} method describing things supported by
+the URL@.
+
+@defun url-http-options url
+Returns a property list describing options available for URL. The
+property list members are:
+
+@table @code
+@item methods
+A list of symbols specifying what HTTP methods the resource
+supports.
+
+@item dav
+@cindex DAV
+A list of numbers specifying what DAV protocol/schema versions are
+supported.
+
+@item dasl
+@cindex DASL
+A list of supported DASL search types supported (string form).
+
+@item ranges
+A list of the units available for use in partial document fetches.
+
+@item p3p
+@cindex P3P
+The @dfn{Platform For Privacy Protection} description for the resource.
+Currently this is just the raw header contents.
+@end table
+
+@end defun
+
+@node Dealing with HTTP documents
+@subsection Dealing with HTTP documents
+
+HTTP URLs are retrieved into a buffer containing the HTTP headers
+followed by the body. Since the headers are quasi-MIME, they may be
+processed using the MIME library. @xref{(emacs-mime)Top, The MIME
+library}. The MIME library doesn't provide a clean function to do
+that, so the URL library does.
+
+@defun url-decode-text-part handle &optional coding
+This function decodes charset-encoded text in the current buffer. In
+Emacs, the buffer is expected to be unibyte initially and is set to
+multibyte after decoding.
+HANDLE is the MIME handle of the original part. CODING is an explicit
+coding to use, overriding what the MIME headers specify.
+The coding system used for the decoding is returned.
+
+Note that this function doesn't deal with @samp{http-equiv} charset
+specifications in HTML @samp{<meta>} elements.
+@end defun
+
+@node file/ftp
+@section file and ftp
+@cindex files
+@cindex FTP
+@cindex File Transfer Protocol
+@cindex compressed files
+@findex dired
+
+@example
+ftp://@var{user}:@var{password}@@@var{host}:@var{port}/@var{file}
+file://@var{user}:@var{password}@@@var{host}:@var{port}/@var{file}
+@end example
+
+These schemes are defined in RFC 1808.
+@samp{ftp:} and @samp{file:} are synonomous in this library. They
+allow reading arbitary files from hosts. Either @samp{ange-ftp}
+(Emacs) or @samp{efs} (XEmacs) is used to retrieve them from remote
+hosts. Local files are accessed directly.
+
+Compressed files are handled, but support is hard-coded so that
+@code{jka-compr-compression-info-list} and so on have no affect.
+Suffixes recognized are @samp{.z}, @samp{.gz}, @samp{.Z} and
+@samp{.bz2}.
+
+@defopt url-directory-index-file
+The filename to look for when indexing a directory, default
+@samp{"index.html"}. If this file exists, and is readable, then it
+will be viewed instead of using @code{dired} to view the directory.
+@end defopt
+
+@node info
+@section info
+@cindex Info
+@cindex Texinfo
+@findex Info-goto-node
+
+@example
+info:@var{file}#@var{node}
+@end example
+
+Info URLs are not officially defined. They invoke
+@code{Info-goto-node} with argument @samp{(@var{file})@var{node}}.
+@samp{#@var{node}} is optional, defaulting to @samp{Top}.
+
+@node mailto
+@section mailto
+
+@cindex mailto
+@cindex email
+A mailto URL will send an email message to the address in the
+URL, for example @samp{mailto:foo@@bar.com} would compose a
+message to @samp{foo@@bar.com}.
+
+@defopt url-mail-command
+@vindex mail-user-agent
+The function called whenever url needs to send mail. This should
+normally be left to default from @var{mail-user-agent}. @xref{Mail
+Methods, , Mail-Composition Methods, emacs, GNU Emacs Manual}.
+@end defopt
+
+An @samp{X-Url-From} header field containing the URL of the document
+that contained the mailto URL is added if that URL is known.
+
+RFC 2368 extends the definition of mailto URLs in RFC 1738.
+The form of a mailto URL is
+@example
+@samp{mailto:@var{mailbox}[?@var{header}=@var{contents}[&@var{header}=@var{contents}]]}
+@end example
+@noindent where an arbitary number of @var{header}s can be added. If the
+@var{header} is @samp{body}, then @var{contents} is put in the body
+otherwise a @var{header} header field is created with @var{contents}
+as its contents. Note that the URL library does not consider any
+headers `dangerous' so you should check them before sending the
+message.
+
+@c Fixme: update
+Email messages are defined in @sc{rfc}822.
+
+@node news/nntp/snews
+@section @code{news}, @code{nntp} and @code{snews}
+@cindex news
+@cindex network news
+@cindex usenet
+@cindex NNTP
+@cindex snews
+
+@c draft-gilman-news-url-01
+The network news URL scheme take the following forms following RFC
+1738 except that for compatibility with other clients, host and port
+fields may be included in news URLs though they are properly only
+allowed for nntp an snews.
+
+@table @samp
+@item news:@var{newsgroup}
+Retrieves a list of messages in @var{newsgroup};
+@item news:@var{message-id}
+Retrieves the message with the given @var{message-id};
+@item news:*
+Retrieves a list of all available newsgroups;
+@item nntp://@var{host}:@var{port}/@var{newsgroup}
+@itemx nntp://@var{host}:@var{port}/@var{message-id}
+@itemx nntp://@var{host}:@var{port}/*
+Similar to the @samp{news} versions.
+@end table
+
+@samp{:@var{port}} is optional and defaults to :119.
+
+@samp{snews} is the same as @samp{nntp} except that the default port
+is :563.
+@cindex SSL
+(It is tunnelled through SSL.)
+
+An @samp{nntp} URL is the same as a news URL, except that the URL may
+specify an article by its number.
+
+@defopt url-news-server
+This variable can be used to override the default news server.
+Usually this will be set by the Gnus package, which is used to fetch
+news.
+@cindex environment variable
+@vindex NNTPSERVER
+It may be set from the conventional environment variable
+@code{NNTPSERVER}.
+@end defopt
+
+@node rlogin/telnet/tn3270
+@section rlogin, telnet and tn3270
+@cindex rlogin
+@cindex telnet
+@cindex tn3270
+@cindex terminal emulation
+@findex terminal-emulator
+
+These URL schemes from RFC 1738 for logon via a terminal emulator have
+the form
+@example
+telnet://@var{user}:@var{password}@@@var{host}:@var{port}
+@end example
+but the @code{:@var{password}} component is ignored.
+
+To handle rlogin, telnet and tn3270 URLs, a @code{rlogin},
+@code{telnet} or @code{tn3270} (the program names and arguments are
+hardcoded) session is run in a @code{terminal-emulator} buffer.
+Well-known ports are used if the URL does not specify a port.
+
+@node irc
+@section irc
+@cindex IRC
+@cindex Internet Relay Chat
+@cindex ZEN IRC
+@c Fixme: reference (was http://www.w3.org/Addressing/draft-mirashi-url-irc-01.txt)
+@dfn{Internet Relay Chat} (IRC) is handled by handing off the @sc{irc}
+session to a function named in @code{url-irc-function}.
+
+@defopt url-irc-function
+A function to actually open an IRC connection.
+This function
+must take five arguments, @var{host}, @var{port}, @var{channel},
+@var{user} and @var{password}. The @var{channel} argument specifies the
+channel to join immediately, this can be @code{nil}. By default this is
+@code{url-irc-zenirc}.
+@end defopt
+@defun url-irc-zenirc host port channel user password
+Processes the arguments and lets @code{zenirc} handle the session.
+@end defun
+
+@node data
+@section data
+@cindex data URLs
+
+@example
+data:@r{[}@var{media-type}@r{]}@r{[};@var{base64}@r{]},@var{data}
+@end example
+
+Data URLs contain MIME data in the URL itself. They are defined in
+RFC 2397.
+
+@var{media-type} is a MIME @samp{Content-Type} string, possibly
+including parameters. It defaults to
+@samp{text/plain;charset=US-ASCII}. The @samp{text/plain} can be
+omitted but the charset parameter supplied. If @samp{;base64} is
+present, the @var{data} are base64-encoded.
+
+@node nfs
+@section nfs
+@cindex NFS
+@cindex Network File System
+@cindex automounter
+
+@example
+nfs://@var{user}:@var{password}@@@var{host}:@var{port}/@var{file}
+@end example
+
+The @samp{nfs:} scheme is defined in RFC 2224. It is similar to
+@samp{ftp:} except that it points to a file on a remote host that is
+handled by the automounter on the local host.
+
+@defvar url-nfs-automounter-directory-spec
+@end defvar
+A string saying how to invoke the NFS automounter. Certain @samp{%}
+sequences are recognized:
+
+@table @samp
+@item %h
+The hostname of the NFS server;
+@item %n
+The port number of the NFS server;
+@item %u
+The username to use to authenticate;
+@item %p
+The password to use to authenticate;
+@item %f
+The filename on the remote server;
+@item %%
+A literal @samp{%}.
+@end table
+
+Each can be used any number of times.
+
+@node cid
+@section cid
+@cindex Content-ID
+
+RFC 2111
+
+@node about
+@section about
+
+@node ldap
+@section ldap
+@cindex LDAP
+@cindex Lightweight Directory Access Protocol
+
+The LDAP scheme is defined in RFC 2255.
+
+@node imap
+@section imap
+@cindex IMAP
+
+RFC 2192
+
+@node man
+@section man
+@cindex @command{man}
+@cindex Unix man pages
+@findex man
+
+@example
+@samp{man:@var{page-spec}}
+@end example
+
+This is a non-standard scheme. @var{page-spec} is passed directly to
+the Lisp @code{man} function.
+
+@node Defining New URLs
+@chapter Defining New URLs
+
+@menu
+* Naming conventions::
+* Required functions::
+* Optional functions::
+* Asynchronous fetching::
+* Supporting file-name-handlers::
+@end menu
+
+@node Naming conventions
+@section Naming conventions
+
+@node Required functions
+@section Required functions
+
+@node Optional functions
+@section Optional functions
+
+@node Asynchronous fetching
+@section Asynchronous fetching
+
+@node Supporting file-name-handlers
+@section Supporting file-name-handlers
+
+@node General Facilities
+@chapter General Facilities
+
+@menu
+* Disk Caching::
+* Proxies::
+* Gateways in general::
+* History::
+@end menu
+
+@node Disk Caching
+@section Disk Caching
+@cindex Caching
+@cindex Persistent Cache
+@cindex Disk Cache
+
+The disk cache stores retrieved documents locally, whence they can be
+retrieved more quickly. When requesting a URL that is in the cache,
+the library checks to see if the page has changed since it was last
+retrieved from the remote machine. If not, the local copy is used,
+saving the transmission over the network.
+@cindex Cleaning the cache
+@cindex Clearing the cache
+@cindex Cache cleaning
+Currently the cache isn't cleared automatically.
+@c Running the @code{clean-cache} shell script
+@c fist is recommended, to allow for future cleaning of the cache. This
+@c shell script will remove all files that have not been accessed since it
+@c was last run. To keep the cache pared down, it is recommended that this
+@c script be run from @i{at} or @i{cron} (see the manual pages for
+@c crontab(5) or at(1) for more information)
+
+@defopt url-automatic-caching
+Setting this variable non-@code{nil} causes documents to be cached
+automatically.
+@end defopt
+
+@defopt url-cache-directory
+This variable specifies the
+directory to store the cache files. It defaults to sub-directory
+@file{cache} of @code{url-configuration-directory}.
+@end defopt
+
+@c Fixme: function v. option, but neither used.
+@c @findex url-cache-expired
+@c @defopt url-cache-expired
+@c This is a function to decide whether or not a cache entry has expired.
+@c It takes two times as it parameters and returns non-@code{nil} if the
+@c second time is ``too old'' when compared with the first time.
+@c @end defopt
+
+@defopt url-cache-creation-function
+The cache relies on a scheme for mapping URLs to files in the cache.
+This variable names a function which sets the type of cache to use.
+It takes a URL as argument and returns the absolute file name of the
+corresponding cache file. The two supplied possibilities are
+@code{url-cache-create-filename-using-md5} and
+@code{url-cache-create-filename-human-readable}.
+@end defopt
+
+@defun url-cache-create-filename-using-md5 url
+Creates a cache file name from @var{url} using MD5 hashing.
+@findex md5
+This is creates entries with very few cache collisions and is fast if
+you have the @code{md5} function as a primitive (Emacs 21 and XEmacs).
+@smallexample
+(url-cache-create-filename-using-md5 "http://www.example.com/foo/bar")
+ @result{} "/home/fx/.url/cache/fx/http/com/example/www/b8a35774ad20db71c7c3409a5410e74f"
+@end smallexample
+@end defun
+
+@defun url-cache-create-filename-human-readable url
+Creates a cache file name from @var{url} more obviously connected to
+@var{url} than for @code{url-cache-create-filename-using-md5}, but
+more likely to conflict with other files.
+@smallexample
+(url-cache-create-filename-human-readable "http://www.example.com/foo/bar")
+ @result{} "/home/fx/.url/cache/fx/http/com/example/www/foo/bar"
+@end smallexample
+@end defun
+
+@c Fixme: never actually used currently?
+@c @defopt url-standalone-mode
+@c @cindex Relying on cache
+@c @cindex Cache only mode
+@c @cindex Standalone mode
+@c If this variable is non-@code{nil}, the library relies solely on the
+@c cache for fetching documents and avoids checking if they have changed
+@c on remote servers.
+@c @end defopt
+
+@c With a large cache of documents on the local disk, it can be very handy
+@c when traveling, or any other time the network connection is not active
+@c (a laptop with a dial-on-demand PPP connection, etc). Emacs/W3 can rely
+@c solely on its cache, and avoid checking to see if the page has changed
+@c on the remote server. In the case of a dial-on-demand PPP connection,
+@c this will keep the phone line free as long as possible, only bringing up
+@c the PPP connection when asking for a page that is not located in the
+@c cache. This is very useful for demonstrations as well.
+
+@node Proxies
+@section Proxies and Gatewaying
+
+@c fixme: check/document url-ns stuff
+@cindex proxy servers
+@cindex proxies
+@cindex environment variables
+@vindex HTTP_PROXY
+Proxy servers are commonly used to provide gateways through firewalls
+or as caches serving some more-or-less local network. Each protocol
+(HTTP, FTP, etc.)@: can have a different gateway server. Proxying is
+conventionally configured commonly amongst different programs through
+environment variables of the form @code{@var{protocol}_proxy}, where
+@var{protocol} is one of the supported network protocols (@code{http},
+@code{ftp} etc.). The library recognizes such variables in either
+upper or lower case. Their values are of one of the forms:
+@itemize @bullet
+@item @code{@var{host}:@var{port}}
+@item A full URL;
+@item Simply a host name.
+@end itemize
+
+@vindex NO_PROXY
+The @code{NO_PROXY} environment variable specifies URLs that should be
+excluded from proxying (on servers that should be contacted directly).
+This should be a comma-separated list of hostnames, domain names, or a
+mixture of both. Asterisks can be used as wildcards, but other
+clients may not support that. Domain names may be indicated by a
+leading dot. For example:
+@example
+NO_PROXY="*.aventail.com,home.com,.seanet.com"
+@end example
+@noindent says to contact all machines in the @samp{aventail.com} and
+@samp{seanet.com} domains directly, as well as the machine named
+@samp{home.com}. If @code{NO_PROXY} isn't defined, @code{no_PROXY}
+and @code{no_proxy} are also tried, in that order.
+
+Proxies may also be specified directly in Lisp.
+
+@defopt url-proxy-services
+This variable is an alist of URL schemes and proxy servers that
+gateway them. The items are of the form @w{@code{(@var{scheme}
+. @var{host}:@var{portnumber})}}, says that the URL @var{scheme} is
+gatewayed through @var{portnumber} on the specified @var{host}. An
+exception is the pseudo scheme @code{"no_proxy"}, which is paired with
+a regexp matching host names not to be proxied. This variable is
+initialized from the environment as above.
+
+@example
+(setq url-proxy-services
+      '(("http" . "proxy.aventail.com:80")
+        ("no_proxy" . "^.*\\(aventail\\|seanet\\)\\.com")))
+@end example
+@end defopt
+
+@node Gateways in general
+@section Gateways in General
+@cindex gateways
+@cindex firewalls
+
+The library provides a general gateway layer through which all
+networking passes. It can both control access to the network and
+provide access through gateways in firewalls. This may make direct
+connexions in some cases and pass through some sort of gateway in
+others.@footnote{Proxies (which only operate over HTTP) are
+implemented using this.} The library's basic function responsible for
+making connexions is @code{url-open-stream}.
+
+@defun url-open-stream name buffer host service
+@cindex opening a stream
+@cindex stream, opening
+Open a stream to @var{host}, possibly via a gateway. The other
+arguments are as for @code{open-network-stream}. This will not make a
+connexion if @code{url-gateway-unplugged} is non-@code{nil}.
+@end defun
+
+@defvar url-gateway-local-host-regexp
+This is a regular expression that matches local hosts that do not
+require the use of a gateway. If @code{nil}, all connexions are made
+through the gateway.
+@end defvar
+
+@defvar url-gateway-method
+This variable controls which gateway method is used. It may be useful
+to bind it temporarily in some applications. It has values taken from
+a list of symbols. Possible values are:
+
+@table @code
+@item telnet
+@cindex @command{telnet}
+Use this method if you must first telnet and log into a gateway host,
+and then run telnet from that host to connect to outside machines.
+
+@item rlogin
+@cindex @command{rlogin}
+This method is identical to @code{telnet}, but uses @command{rlogin}
+to log into the remote machine without having to send the username and
+password over the wire every time.
+
+@item socks
+@cindex @sc{socks}
+Use if the firewall has a @sc{socks} gateway running on it. The
+@sc{socks} v5 protocol is defined in RFC 1928.
+
+@c @item ssl
+@c This probably shouldn't be documented
+@c Fixme: why not? -- fx
+
+@item native
+This method uses Emacs's builtin networking directly. This is the
+default. It can be used only if there is no firewall blocking access.
+@end table
+@end defvar
+
+The following variables control the gateway methods.
+
+@defopt url-gateway-telnet-host
+The gateway host to telnet to. Once logged in there, you then telnet
+out to the hosts you want to connect to.
+@end defopt
+@defopt url-gateway-telnet-parameters
+This should be a list of parameters to pass to the @command{telnet} program.
+@end defopt
+@defopt url-gateway-telnet-password-prompt
+This is a regular expression that matches the password prompt when
+logging in.
+@end defopt
+@defopt url-gateway-telnet-login-prompt
+This is a regular expression that matches the username prompt when
+logging in.
+@end defopt
+@defopt url-gateway-telnet-user-name
+The username to log in with.
+@end defopt +@defopt url-gateway-telnet-password +The password to send when logging in. +@end defopt +@defopt url-gateway-prompt-pattern +This is a regular expression that matches the shell prompt. +@end defopt + +@defopt url-gateway-rlogin-host +Host to @samp{rlogin} to before telnetting out. +@end defopt +@defopt url-gateway-rlogin-parameters +Parameters to pass to @samp{rsh}. +@end defopt +@defopt url-gateway-rlogin-user-name +User name to use when logging in to the gateway. +@end defopt +@defopt url-gateway-prompt-pattern +This is a regular expression that matches the shell prompt. +@end defopt + +@defopt socks-server +This specifies the default server; it takes the form +@w{@code{("Default server" @var{server} @var{port} @var{version})}} +where @var{version} can be either 4 or 5. +@end defopt +@defvar socks-password +If this is @code{nil} then you will be asked for the password, +otherwise it will be used as the password for authenticating you to +the @sc{socks} server. +@end defvar +@defvar socks-username +This is the username to use when authenticating yourself to the +@sc{socks} server. By default this is your login name. +@end defvar +@defvar socks-timeout +This controls how long, in seconds, to wait for responses from the +@sc{socks} server; it is 5 by default. +@end defvar +@c fixme: these have been effectively commented-out in the code +@c @defopt socks-server-aliases +@c This is a list of server aliases. It is a list of aliases of the form +@c @var{(alias hostname port version)}. +@c @end defopt +@c @defopt socks-network-aliases +@c This is a list of network aliases. Each entry in the list takes the form +@c @var{(alias (network))} where @var{alias} is a string that names the +@c @var{network}. The networks can contain a pair (not a dotted pair) of +@c @sc{ip} addresses which specify a range of @sc{ip} addresses, an @sc{ip} +@c address and a netmask, a domain name or a unique hostname or @sc{ip} +@c address.
+@c @end defopt +@c @defopt socks-redirection-rules +@c This is a list of redirection rules. Each rule takes the form +@c @var{(Destination network Connection type)} where @var{Destination +@c network} is a network alias from @code{socks-network-aliases} and +@c @var{Connection type} can be @code{nil} in which case a direct +@c connection is used, or it can be an alias from +@c @code{socks-server-aliases} in which case that server is used as a +@c proxy. +@c @end defopt +@defopt socks-nslookup-program +@cindex @command{nslookup} +This is the @samp{nslookup} program. It is @code{"nslookup"} by default. +@end defopt + +@menu +* Suppressing network connexions:: +@end menu +@c * Broken hostname resolution:: + +@node Suppressing network connexions +@subsection Suppressing Network Connexions + +@cindex network connexions, suppressing +@cindex suppressing network connexions +@cindex bugs, HTML +@cindex HTML `bugs' +In some circumstances it is desirable to suppress making network +connexions. A typical case is when rendering HTML in a mail user +agent, when external URLs should not be activated, particularly to +avoid `bugs' which `call home' by fetching single-pixel images and the +like. To arrange this, bind the following variable for the duration +of such processing. + +@defvar url-gateway-unplugged +If this variable is non-@code{nil}, new network connexions are never +opened by the URL library. +@end defvar + +@c @node Broken hostname resolution +@c @subsection Broken Hostname Resolution + +@c @cindex hostname resolver +@c @cindex resolver, hostname +@c Some C libraries do not include the hostname resolver routines in +@c their static libraries. If Emacs was linked statically, and was not +@c linked with the resolver libraries, it will not be able to get to any +@c machines off the local network.
This is characterized by being able +@c to reach someplace with a raw ip number, but not its hostname +@c (@url{http://129.79.254.191/} works, but +@c @url{http://www.cs.indiana.edu/} doesn't). This used to happen on +@c SunOS4 and Ultrix, but is now probably rare. If Emacs can't be +@c rebuilt linked against the resolver library, it can use the external +@c @command{nslookup} program instead. + +@c @defopt url-gateway-broken-resolution +@c @cindex @code{nslookup} program +@c @cindex program, @code{nslookup} +@c If non-@code{nil}, this variable says to use the program specified by +@c @code{url-gateway-nslookup-program} to do hostname resolution. +@c @end defopt + +@c @defopt url-gateway-nslookup-program +@c The name of the program to do hostname lookup if Emacs can't do it +@c directly. This program should expect a single argument on the command +@c line---the hostname to resolve---and should produce output similar to +@c the standard Unix @command{nslookup} program: +@c @example +@c Name: www.cs.indiana.edu +@c Address: 129.79.254.191 +@c @end example +@c @end defopt + +@node History +@section History + +The library can maintain a global history list tracking URLs accessed. +URL completion can be done from it. The history mechanism is set up +@findex url-do-setup +automatically via @code{url-do-setup} when it is configured to be on. +Note that the size of the history list is currently not limited. + +@vindex url-history-hash-table +The history `list' is actually a hash table, +@code{url-history-hash-table}. It contains access times keyed by URL +strings. The times are in the format returned by @code{current-time}. + +@defun url-history-update-url url time +This function updates the history table with an entry for @var{url} +accessed at the given @var{time}. +@end defun + +@defopt url-history-track +If non-@code{nil}, the library will keep track of all the URLs +accessed. If it is @code{t}, the list is saved to disk at the end of +each Emacs session.
The default is @code{nil}. +@end defopt + +@defopt url-history-file +The file storing the history list between sessions. It defaults to +@file{history} in @code{url-configuration-directory}. +@end defopt + +@defopt url-history-save-interval +@findex url-history-setup-save-timer +The number of seconds between automatic saves of the history list. +Default is one hour. Note that if you change this variable directly, +rather than using Custom, after @code{url-do-setup} has been run, you +need to run the function @code{url-history-setup-save-timer}. +@end defopt + +@defun url-history-parse-history &optional fname +Parses the history file @var{fname} (default @code{url-history-file}) +and sets up the history list. +@end defun + +@defun url-history-save-history &optional fname +Saves the current history to file @var{fname} (default +@code{url-history-file}). +@end defun + +@defun url-completion-function string predicate function +You can use this function to do completion of URLs from the history. +@end defun + +@node Customization +@chapter Customization + +@section Environment Variables + +@cindex environment variables +The following environment variables affect the library's operation at +startup. + +@table @code +@item TMPDIR +@vindex TMPDIR +@vindex url-temporary-directory +If this is defined, @code{url-temporary-directory} is initialized from +it. +@end table + +@section General User Options + +The following user options, settable with Customize, affect the +general operation of the package. + +@defopt url-debug +@cindex debugging +Specifies the types of library debug messages which are logged to +the @code{*URL-DEBUG*} buffer. +@code{t} means log all messages. +A number means log all messages and show them with @code{message}. +It may also be a list of the types of messages to be logged.
+@end defopt +@defopt url-personal-mail-address +@end defopt +@defopt url-privacy-level +@end defopt +@defopt url-uncompressor-alist +@end defopt +@defopt url-passwd-entry-func +@end defopt +@defopt url-standalone-mode +@end defopt +@defopt url-bad-port-list +@end defopt +@defopt url-max-password-attempts +@end defopt +@defopt url-temporary-directory +@end defopt +@defopt url-show-status +@end defopt +@defopt url-confirmation-func +The function to use for asking yes or no questions. This is normally +either @code{y-or-n-p} or @code{yes-or-no-p}, but could be another +function taking a single argument (the prompt) and returning @code{t} +only if an affirmative answer is given. +@end defopt +@defopt url-gateway-method +@c fixme: describe gatewaying +A symbol specifying the type of gateway support to use for connexions +from the local machine. The supported methods are: + +@table @code +@item telnet +Run telnet in a subprocess to connect; +@item rlogin +Rlogin to another machine to connect; +@item socks +Connect through a socks server; +@item ssl +Connect with SSL; +@item native +Connect directly. +@end table +@end defopt + +@node Function Index +@unnumbered Command and Function Index +@printindex fn + +@node Variable Index +@unnumbered Variable Index +@printindex vr + +@node Concept Index +@unnumbered Concept Index +@printindex cp + +@setchapternewpage odd +@contents +@bye