# HG changeset patch # User Igor TAmara # Date 1226112177 18000 # Node ID b05e35d641e45c9bdafa5c342fef2661a5bba285 # Parent 3e78daaad99bdc3c9e0ab69483ee6acfba6a5293 Copying the files from en to es and taking intro chapter diff -r 3e78daaad99b -r b05e35d641e4 es/Leame.1st --- a/es/Leame.1st Fri Nov 07 21:33:22 2008 -0500 +++ b/es/Leame.1st Fri Nov 07 21:42:57 2008 -0500 @@ -102,6 +102,7 @@ || undo.tex || Igor Támara || 100% || 26/10/2008 || 07/11/2008 || || tour-merge.tex || Javier Rojas || 100% || 28/10/2008 || 03/11/2008 || || concepts.tex || Javier Rojas || 7% || 03/11/2008 || || +|| intro.tex || Igor Támara || 0% || 08/11/2008 || || == Archivos en proceso de revisión == ||'''archivo''' || '''revisor''' ||'''Estado'''||'''Inicio'''|| '''Fin''' || diff -r 3e78daaad99b -r b05e35d641e4 es/cmdref.tex --- a/es/cmdref.tex Fri Nov 07 21:33:22 2008 -0500 +++ b/es/cmdref.tex Fri Nov 07 21:42:57 2008 -0500 @@ -0,0 +1,176 @@ +\chapter{Command reference} +\label{cmdref} + +\cmdref{add}{add files at the next commit} +\optref{add}{I}{include} +\optref{add}{X}{exclude} +\optref{add}{n}{dry-run} + +\cmdref{diff}{print changes in history or working directory} + +Show differences between revisions for the specified files or +directories, using the unified diff format. For a description of the +unified diff format, see section~\ref{sec:mq:patch}. + +By default, this command does not print diffs for files that Mercurial +considers to contain binary data. To control this behaviour, see the +\hgopt{diff}{-a} and \hgopt{diff}{--git} options. + +\subsection{Options} + +\loptref{diff}{nodates} + +Omit date and time information when printing diff headers. + +\optref{diff}{B}{ignore-blank-lines} + +Do not print changes that only insert or delete blank lines. A line +that contains only whitespace is not considered blank. + +\optref{diff}{I}{include} + +Include files and directories whose names match the given patterns. 
+ +\optref{diff}{X}{exclude} + +Exclude files and directories whose names match the given patterns. + +\optref{diff}{a}{text} + +If this option is not specified, \hgcmd{diff} will refuse to print +diffs for files that it detects as binary. Specifying \hgopt{diff}{-a} +forces \hgcmd{diff} to treat all files as text, and generate diffs for +all of them. + +This option is useful for files that are ``mostly text'' but have a +few embedded NUL characters. If you use it on files that contain a +lot of binary data, its output will be incomprehensible. + +\optref{diff}{b}{ignore-space-change} + +Do not print a line if the only change to that line is in the amount +of white space it contains. + +\optref{diff}{g}{git} + +Print \command{git}-compatible diffs. XXX reference a format +description. + +\optref{diff}{p}{show-function} + +Display the name of the enclosing function in a hunk header, using a +simple heuristic. This functionality is enabled by default, so the +\hgopt{diff}{-p} option has no effect unless you change the value of +the \rcitem{diff}{showfunc} config item, as in the following example. +\interaction{cmdref.diff-p} + +\optref{diff}{r}{rev} + +Specify one or more revisions to compare. The \hgcmd{diff} command +accepts up to two \hgopt{diff}{-r} options to specify the revisions to +compare. + +\begin{enumerate} +\setcounter{enumi}{0} +\item Display the differences between the parent revision of the + working directory and the working directory. +\item Display the differences between the specified changeset and the + working directory. +\item Display the differences between the two specified changesets. +\end{enumerate} + +You can specify two revisions using either two \hgopt{diff}{-r} +options or revision range notation. For example, the two revision +specifications below are equivalent. 
+\begin{codesample2} + hg diff -r 10 -r 20 + hg diff -r10:20 +\end{codesample2} + +When you provide two revisions, Mercurial treats the order of those +revisions as significant. Thus, \hgcmdargs{diff}{-r10:20} will +produce a diff that will transform files from their contents as of +revision~10 to their contents as of revision~20, while +\hgcmdargs{diff}{-r20:10} means the opposite: the diff that will +transform files from their revision~20 contents to their revision~10 +contents. You cannot reverse the ordering in this way if you are +diffing against the working directory. + +\optref{diff}{w}{ignore-all-space} + +\cmdref{version}{print version and copyright information} + +This command displays the version of Mercurial you are running, and +its copyright license. There are four kinds of version string that +you may see. +\begin{itemize} +\item The string ``\texttt{unknown}''. This version of Mercurial was + not built in a Mercurial repository, and cannot determine its own + version. +\item A short numeric string, such as ``\texttt{1.1}''. This is a + build of a revision of Mercurial that was identified by a specific + tag in the repository where it was built. (This doesn't necessarily + mean that you're running an official release; someone else could + have added that tag to any revision in the repository where they + built Mercurial.) +\item A hexadecimal string, such as ``\texttt{875489e31abe}''. This + is a build of the given revision of Mercurial. +\item A hexadecimal string followed by a date, such as + ``\texttt{875489e31abe+20070205}''. This is a build of the given + revision of Mercurial, where the build repository contained some + local changes that had not been committed. 
+\end{itemize} + +\subsection{Tips and tricks} + +\subsubsection{Why do the results of \hgcmd{diff} and \hgcmd{status} + differ?} +\label{cmdref:diff-vs-status} + +When you run the \hgcmd{status} command, you'll see a list of files +that Mercurial will record changes for the next time you perform a +commit. If you run the \hgcmd{diff} command, you may notice that it +prints diffs for only a \emph{subset} of the files that \hgcmd{status} +listed. There are two possible reasons for this. + +The first is that \hgcmd{status} prints some kinds of modifications +that \hgcmd{diff} doesn't normally display. The \hgcmd{diff} command +normally outputs unified diffs, which don't have the ability to +represent some changes that Mercurial can track. Most notably, +traditional diffs can't represent a change in whether or not a file is +executable, but Mercurial records this information. + +If you use the \hgopt{diff}{--git} option to \hgcmd{diff}, it will +display \command{git}-compatible diffs that \emph{can} display this +extra information. + +The second possible reason that \hgcmd{diff} might be printing diffs +for a subset of the files displayed by \hgcmd{status} is that if you +invoke it without any arguments, \hgcmd{diff} prints diffs against the +first parent of the working directory. If you have run \hgcmd{merge} +to merge two changesets, but you haven't yet committed the results of +the merge, your working directory has two parents (use \hgcmd{parents} +to see them). While \hgcmd{status} prints modifications relative to +\emph{both} parents after an uncommitted merge, \hgcmd{diff} still +operates relative only to the first parent. You can get it to print +diffs relative to the second parent by specifying that parent with the +\hgopt{diff}{-r} option. There is no way to print diffs relative to +both parents. 
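As a minimal sketch of the workaround described above, the shell function below runs \hgcmd{diff} once against each parent of an uncommitted merge, since a single invocation cannot diff against both. The function name \texttt{diff\_each\_parent} and its two arguments are assumptions made for illustration; pass the two revisions that \hgcmd{parents} reports. The \hgopt{diff}{--git} option is included so that changes such as the executable bit also show up.

```shell
# Hypothetical helper (not part of Mercurial): show an uncommitted merge
# from both sides by diffing against each parent in turn.
# $1 and $2 are the two parent revisions, as reported by `hg parents`.
diff_each_parent() {
  for rev in "$1" "$2"; do
    echo "== diff against parent $rev =="
    # --git lets the diff represent changes a traditional diff cannot,
    # such as a change in whether a file is executable.
    hg diff --git -r "$rev"
  done
}
```

For example, if \hgcmd{parents} reports revisions 10 and 12, running \texttt{diff\_each\_parent 10 12} prints the two diffs one after the other.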
+ +\subsubsection{Generating safe binary diffs} + +If you use the \hgopt{diff}{-a} option to force Mercurial to print +diffs of files that are either ``mostly text'' or contain lots of +binary data, those diffs cannot subsequently be applied by either +Mercurial's \hgcmd{import} command or the system's \command{patch} +command. + +If you want to generate a diff of a binary file that is safe to use as +input for \hgcmd{import}, use the \hgopt{diff}{--git} option when you +generate the patch. The system \command{patch} command cannot handle +binary patches at all. + +%%% Local Variables: +%%% mode: latex +%%% TeX-master: "00book" +%%% End: diff -r 3e78daaad99b -r b05e35d641e4 es/collab.tex --- a/es/collab.tex Fri Nov 07 21:33:22 2008 -0500 +++ b/es/collab.tex Fri Nov 07 21:42:57 2008 -0500 @@ -0,0 +1,1118 @@ +\chapter{Collaborating with other people} +\label{cha:collab} + +As a completely decentralised tool, Mercurial doesn't impose any +policy on how people ought to work with each other. However, if +you're new to distributed revision control, it helps to have some +tools and examples in mind when you're thinking about possible +workflow models. + +\section{Mercurial's web interface} + +Mercurial has a powerful web interface that provides several +useful capabilities. + +For interactive use, the web interface lets you browse a single +repository or a collection of repositories. You can view the history +of a repository, examine each change (comments and diffs), and view +the contents of each directory and file. + +Also for human consumption, the web interface provides an RSS feed of +the changes in a repository. This lets you ``subscribe'' to a +repository using your favourite feed reader, and be automatically +notified of activity in that repository as soon as it happens. 
I find +this capability much more convenient than the model of subscribing to +a mailing list to which notifications are sent, as it requires no +additional configuration on the part of whoever is serving the +repository. + +The web interface also lets remote users clone a repository, pull +changes from it, and (when the server is configured to permit it) push +changes back to it. Mercurial's HTTP tunneling protocol aggressively +compresses data, so that it works efficiently even over low-bandwidth +network connections. + +The easiest way to get started with the web interface is to use your +web browser to visit an existing repository, such as the master +Mercurial repository at +\url{http://www.selenic.com/repo/hg?style=gitweb}. + +If you're interested in providing a web interface to your own +repositories, Mercurial provides two ways to do this. The first is +using the \hgcmd{serve} command, which is best suited to short-term +``lightweight'' serving. See section~\ref{sec:collab:serve} below for +details of how to use this command. If you have a long-lived +repository that you'd like to make permanently available, Mercurial +has built-in support for the CGI (Common Gateway Interface) standard, +which all common web servers support. See +section~\ref{sec:collab:cgi} for details of CGI configuration. + +\section{Collaboration models} + +With a suitably flexible tool, making decisions about workflow is much +more of a social engineering challenge than a technical one. +Mercurial imposes few limitations on how you can structure the flow of +work in a project, so it's up to you and your group to set up and live +with a model that matches your own particular needs. + +\subsection{Factors to keep in mind} + +The most important aspect of any model that you must keep in mind is +how well it matches the needs and capabilities of the people who will +be using it. This might seem self-evident; even so, you still can't +afford to forget it for a moment. 
+ +I once put together a workflow model that seemed to make perfect sense +to me, but that caused a considerable amount of consternation and +strife within my development team. In spite of my attempts to explain +why we needed a complex set of branches, and how changes ought to flow +between them, a few team members revolted. Even though they were +smart people, they didn't want to pay attention to the constraints we +were operating under, or face the consequences of those constraints in +the details of the model that I was advocating. + +Don't sweep foreseeable social or technical problems under the rug. +Whatever scheme you put into effect, you should plan for mistakes and +problem scenarios. Consider adding automated machinery to prevent, or +quickly recover from, trouble that you can anticipate. As an example, +if you intend to have a branch with not-for-release changes in it, +you'd do well to think early about the possibility that someone might +accidentally merge those changes into a release branch. You could +avoid this particular problem by writing a hook that prevents changes +from being merged from an inappropriate branch. + +\subsection{Informal anarchy} + +I wouldn't suggest an ``anything goes'' approach as something +sustainable, but it's a model that's easy to grasp, and it works +perfectly well in a few unusual situations. + +As one example, many projects have a loose-knit group of collaborators +who rarely physically meet each other. Some groups like to overcome +the isolation of working at a distance by organising occasional +``sprints''. In a sprint, a number of people get together in a single +location (a company's conference room, a hotel meeting room, that kind +of place) and spend several days more or less locked in there, hacking +intensely on a handful of projects. + +A sprint is the perfect place to use the \hgcmd{serve} command, since +\hgcmd{serve} does not require any fancy server infrastructure. 
You +can get started with \hgcmd{serve} in moments, by reading +section~\ref{sec:collab:serve} below. Then simply tell the person +next to you that you're running a server, send the URL to them in an +instant message, and you immediately have a quick-turnaround way to +work together. They can type your URL into their web browser and +quickly review your changes; or they can pull a bugfix from you and +verify it; or they can clone a branch containing a new feature and try +it out. + +The charm, and the problem, with doing things in an ad hoc fashion +like this is that only people who know about your changes, and where +they are, can see them. Such an informal approach simply doesn't +scale beyond a handful of people, because each individual needs to know +about $n$ different repositories to pull from. + +\subsection{A single central repository} + +For smaller projects migrating from a centralised revision control +tool, perhaps the easiest way to get started is to have changes flow +through a single shared central repository. This is also the +most common ``building block'' for more ambitious workflow schemes. + +Contributors start by cloning a copy of this repository. They can +pull changes from it whenever they need to, and some (perhaps all) +developers have permission to push a change back when they're ready +for other people to see it. + +Under this model, it can still often make sense for people to pull +changes directly from each other, without going through the central +repository. Consider a case in which I have a tentative bug fix, but +I am worried that if I were to publish it to the central repository, +it might subsequently break everyone else's trees as they pull it. To +reduce the potential for damage, I can ask you to clone my repository +into a temporary repository of your own and test it. This lets us put +off publishing the potentially unsafe change until it has had a little +testing. 
+ +In this kind of scenario, people usually use the \command{ssh} +protocol to securely push changes to the central repository, as +documented in section~\ref{sec:collab:ssh}. It's also usual to +publish a read-only copy of the repository over HTTP using CGI, as in +section~\ref{sec:collab:cgi}. Publishing over HTTP satisfies the +needs of people who don't have push access, and those who want to use +web browsers to browse the repository's history. + +\subsection{Working with multiple branches} + +Projects of any significant size naturally tend to make progress on +several fronts simultaneously. In the case of software, it's common +for a project to go through periodic official releases. A release +might then go into ``maintenance mode'' for a while after its first +publication; maintenance releases tend to contain only bug fixes, not +new features. In parallel with these maintenance releases, one or +more future releases may be under development. People normally use +the word ``branch'' to refer to one of these many slightly different +directions in which development is proceeding. + +Mercurial is particularly well suited to managing a number of +simultaneous, but not identical, branches. Each ``development +direction'' can live in its own central repository, and you can merge +changes from one to another as the need arises. Because repositories +are independent of each other, unstable changes in a development +branch will never affect a stable branch unless someone explicitly +merges those changes in. + +Here's an example of how this can work in practice. Let's say you +have one ``main branch'' on a central server. +\interaction{branching.init} +People clone it, make changes locally, test them, and push them back. + +Once the main branch reaches a release milestone, you can use the +\hgcmd{tag} command to give a permanent name to the milestone +revision. +\interaction{branching.tag} +Let's say some ongoing development occurs on the main branch. 
+\interaction{branching.main} +Using the tag that was recorded at the milestone, people who clone +that repository at any time in the future can use \hgcmd{update} to +get a copy of the working directory exactly as it was when that tagged +revision was committed. +\interaction{branching.update} + +In addition, immediately after the main branch is tagged, someone can +then clone the main branch on the server to a new ``stable'' branch, +also on the server. +\interaction{branching.clone} + +Someone who needs to make a change to the stable branch can then clone +\emph{that} repository, make their changes, commit, and push their +changes back there. +\interaction{branching.stable} +Because Mercurial repositories are independent, and Mercurial doesn't +move changes around automatically, the stable and main branches are +\emph{isolated} from each other. The changes that you made on the +main branch don't ``leak'' to the stable branch, and vice versa. + +You'll often want all of your bugfixes on the stable branch to show up +on the main branch, too. Rather than rewrite a bugfix on the main +branch, you can simply pull and merge changes from the stable to the +main branch, and Mercurial will bring those bugfixes in for you. +\interaction{branching.merge} +The main branch will still contain changes that are not on the stable +branch, but it will also contain all of the bugfixes from the stable +branch. The stable branch remains unaffected by these changes. + +\subsection{Feature branches} + +For larger projects, an effective way to manage change is to break up +a team into smaller groups. Each group has a shared branch of its +own, cloned from a single ``master'' branch used by the entire +project. People working on an individual branch are typically quite +isolated from developments on other branches. 
+ +\begin{figure}[ht] + \centering + \grafix{feature-branches} + \caption{Feature branches} + \label{fig:collab:feature-branches} +\end{figure} + +When a particular feature is deemed to be in suitable shape, someone +on that feature team pulls and merges from the master branch into the +feature branch, then pushes back up to the master branch. + +\subsection{The release train} + +Some projects are organised on a ``train'' basis: a release is +scheduled to happen every few months, and whatever features are ready +when the ``train'' is ready to leave are allowed in. + +This model resembles working with feature branches. The difference is +that when a feature branch misses a train, someone on the feature team +pulls and merges the changes that went out on that train release into +the feature branch, and the team continues its work on top of that +release so that their feature can make the next release. + +\subsection{The Linux kernel model} + +The development of the Linux kernel has a shallow hierarchical +structure, surrounded by a cloud of apparent chaos. Because most +Linux developers use \command{git}, a distributed revision control +tool with capabilities similar to Mercurial, it's useful to describe +the way work flows in that environment; if you like the ideas, the +approach translates well across tools. + +At the center of the community sits Linus Torvalds, the creator of +Linux. He publishes a single source repository that is considered the +``authoritative'' current tree by the entire developer community. +Anyone can clone Linus's tree, but he is very choosy about whose trees +he pulls from. + +Linus has a number of ``trusted lieutenants''. As a general rule, he +pulls whatever changes they publish, in most cases without even +reviewing those changes. Some of those lieutenants are generally +agreed to be ``maintainers'', responsible for specific subsystems +within the kernel. 
If a random kernel hacker wants to make a change +to a subsystem that they want to end up in Linus's tree, they must +find out who the subsystem's maintainer is, and ask that maintainer to +take their change. If the maintainer reviews their changes and agrees +to take them, they'll pass them along to Linus in due course. + +Individual lieutenants have their own approaches to reviewing, +accepting, and publishing changes; and for deciding when to feed them +to Linus. In addition, there are several well known branches that +people use for different purposes. For example, a few people maintain +``stable'' repositories of older versions of the kernel, to which they +apply critical fixes as needed. Some maintainers publish multiple +trees: one for experimental changes; one for changes that they are +about to feed upstream; and so on. Others just publish a single +tree. + +This model has two notable features. The first is that it's ``pull +only''. You have to ask, convince, or beg another developer to take a +change from you, because there are almost no trees to which more than +one person can push, and there's no way to push changes into a tree +that someone else controls. + +The second is that it's based on reputation and acclaim. If you're an +unknown, Linus will probably ignore changes from you without even +responding. But a subsystem maintainer will probably review them, and +will likely take them if they pass their criteria for suitability. +The more ``good'' changes you contribute to a maintainer, the more +likely they are to trust your judgment and accept your changes. If +you're well-known and maintain a long-lived branch for something Linus +hasn't yet accepted, people with similar interests may pull your +changes regularly to keep up with your work. + +Reputation and acclaim don't necessarily cross subsystem or ``people'' +boundaries. 
If you're a respected but specialised storage hacker, and +you try to fix a networking bug, that change will receive a level of +scrutiny from a network maintainer comparable to a change from a +complete stranger. + +To people who come from more orderly project backgrounds, the +comparatively chaotic Linux kernel development process often seems +completely insane. It's subject to the whims of individuals; people +make sweeping changes whenever they deem it appropriate; and the pace +of development is astounding. And yet Linux is a highly successful, +well-regarded piece of software. + +\subsection{Pull-only versus shared-push collaboration} + +A perpetual source of heat in the open source community is whether a +development model in which people only ever pull changes from others +is ``better than'' one in which multiple people can push changes to a +shared repository. + +Typically, the backers of the shared-push model use tools that +actively enforce this approach. If you're using a centralised +revision control tool such as Subversion, there's no way to make a +choice over which model you'll use: the tool gives you shared-push, +and if you want to do anything else, you'll have to roll your own +approach on top (such as applying a patch by hand). + +A good distributed revision control tool, such as Mercurial, will +support both models. You and your collaborators can then structure +how you work together based on your own needs and preferences, not on +what contortions your tools force you into. + +\subsection{Where collaboration meets branch management} + +Once you and your team set up some shared repositories and start +propagating changes back and forth between local and shared repos, you +begin to face a related, but slightly different challenge: that of +managing the multiple directions in which your team may be moving at +once. 
Even though this subject is intimately related to how your team +collaborates, it's dense enough to merit treatment of its own, in +chapter~\ref{chap:branch}. + +\section{The technical side of sharing} + +The remainder of this chapter is devoted to the question of serving +data to your collaborators. + +\section{Informal sharing with \hgcmd{serve}} +\label{sec:collab:serve} + +Mercurial's \hgcmd{serve} command is wonderfully suited to small, +tight-knit, and fast-paced group environments. It also provides a +great way to get a feel for using Mercurial commands over a network. + +Run \hgcmd{serve} inside a repository, and in under a second it will +bring up a specialised HTTP server; this will accept connections from +any client, and serve up data for that repository until you terminate +it. Anyone who knows the URL of the server you just started, and can +talk to your computer over the network, can then use a web browser or +Mercurial to read data from that repository. A URL for a +\hgcmd{serve} instance running on a laptop is likely to look something +like \Verb|http://my-laptop.local:8000/|. + +The \hgcmd{serve} command is \emph{not} a general-purpose web server. +It can do only two things: +\begin{itemize} +\item Allow people to browse the history of the repository it's + serving, from their normal web browsers. +\item Speak Mercurial's wire protocol, so that people can + \hgcmd{clone} or \hgcmd{pull} changes from that repository. +\end{itemize} +In particular, \hgcmd{serve} won't allow remote users to \emph{modify} +your repository. It's intended for read-only use. + +If you're getting started with Mercurial, there's nothing to prevent +you from using \hgcmd{serve} to serve up a repository on your own +computer, then use commands like \hgcmd{clone}, \hgcmd{incoming}, and +so on to talk to that server as if the repository was hosted remotely. +This can help you to quickly get acquainted with using commands on +network-hosted repositories. 
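The workflow above can be sketched in a few lines of shell. The helper name \texttt{serve\_url} is an assumption made for illustration, as is the clone directory name in the comments; the hostname is the example one used above, and the port defaults to~8000 to match the example URL.

```shell
# Hypothetical helper (not part of Mercurial): build the URL that
# collaborators would use to reach an `hg serve` instance.
# $1 is the host name; $2 is the port, defaulting to 8000.
serve_url() {
  host=$1
  port=${2:-8000}
  printf 'http://%s:%s/\n' "$host" "$port"
}

# A typical session, left as comments so the block has no side effects:
#   hg serve                                       # run inside the repository
#   hg clone "$(serve_url my-laptop.local)" copy   # on a collaborator's machine
#   hg incoming "$(serve_url my-laptop.local)"     # list changes not yet pulled
```

Because the server is read-only, the collaborator's side of this session only ever clones or pulls.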
+ +\subsection{A few things to keep in mind} + +Because it provides unauthenticated read access to all clients, you +should only use \hgcmd{serve} in an environment where you either don't +care, or have complete control over, who can access your network and +pull data from your repository. + +The \hgcmd{serve} command knows nothing about any firewall software +you might have installed on your system or network. It cannot detect +or control your firewall software. If other people are unable to talk +to a running \hgcmd{serve} instance, the second thing you should do +(\emph{after} you make sure that they're using the correct URL) is +check your firewall configuration. + +By default, \hgcmd{serve} listens for incoming connections on +port~8000. If another process is already listening on the port you +want to use, you can specify a different port to listen on using the +\hgopt{serve}{-p} option. + +Normally, when \hgcmd{serve} starts, it prints no output, which can be +a bit unnerving. If you'd like to confirm that it is indeed running +correctly, and find out what URL you should send to your +collaborators, start it with the \hggopt{-v} option. + +\section{Using the Secure Shell (ssh) protocol} +\label{sec:collab:ssh} + +You can pull and push changes securely over a network connection using +the Secure Shell (\texttt{ssh}) protocol. To use this successfully, +you may have to do a little bit of configuration on the client or +server sides. + +If you're not familiar with ssh, it's a network protocol that lets you +securely communicate with another computer. To use it with Mercurial, +you'll be setting up one or more user accounts on a server so that +remote users can log in and execute commands. + +(If you \emph{are} familiar with ssh, you'll probably find some of the +material that follows to be elementary in nature.) 
+ +\subsection{How to read and write ssh URLs} + +An ssh URL tends to look like this: +\begin{codesample2} + ssh://bos@hg.serpentine.com:22/hg/hgbook +\end{codesample2} +\begin{enumerate} +\item The ``\texttt{ssh://}'' part tells Mercurial to use the ssh + protocol. +\item The ``\texttt{bos@}'' component indicates what username to log + into the server as. You can leave this out if the remote username + is the same as your local username. +\item The ``\texttt{hg.serpentine.com}'' gives the hostname of the + server to log into. +\item The ``:22'' identifies the port number to connect to the server + on. The default port is~22, so you only need to specify this part + if you're \emph{not} using port~22. +\item The remainder of the URL is the local path to the repository on + the server. +\end{enumerate} + +There's plenty of scope for confusion with the path component of ssh +URLs, as there is no standard way for tools to interpret it. Some +programs behave differently than others when dealing with these paths. +This isn't an ideal situation, but it's unlikely to change. Please +read the following paragraphs carefully. + +Mercurial treats the path to a repository on the server as relative to +the remote user's home directory. For example, if user \texttt{foo} +on the server has a home directory of \dirname{/home/foo}, then an ssh +URL that contains a path component of \dirname{bar} +\emph{really} refers to the directory \dirname{/home/foo/bar}. + +If you want to specify a path relative to another user's home +directory, you can use a path that starts with a tilde character +followed by the user's name (let's call them \texttt{otheruser}), like +this. +\begin{codesample2} + ssh://server/~otheruser/hg/repo +\end{codesample2} + +And if you really want to specify an \emph{absolute} path on the +server, begin the path component with two slashes, as in this example. 
+\begin{codesample2} + ssh://server//absolute/path +\end{codesample2} + +\subsection{Finding an ssh client for your system} + +Almost every Unix-like system comes with OpenSSH preinstalled. If +you're using such a system, run \Verb|which ssh| to find out if +the \command{ssh} command is installed (it's usually in +\dirname{/usr/bin}). In the unlikely event that it isn't present, +take a look at your system documentation to figure out how to install +it. + +On Windows, you'll first need to download a suitable ssh +client. There are two alternatives. +\begin{itemize} +\item Simon Tatham's excellent PuTTY package~\cite{web:putty} provides + a complete suite of ssh client commands. +\item If you have a high tolerance for pain, you can use the Cygwin + port of OpenSSH. +\end{itemize} +In either case, you'll need to edit your \hgini\ file to tell +Mercurial where to find the actual client command. For example, if +you're using PuTTY, you'll need to use the \command{plink} command as +a command-line ssh client. +\begin{codesample2} + [ui] + ssh = C:/path/to/plink.exe -ssh -i "C:/path/to/my/private/key" +\end{codesample2} + +\begin{note} + The path to \command{plink} shouldn't contain any whitespace + characters, or Mercurial may not be able to run it correctly (so + putting it in \dirname{C:\\Program Files} is probably not a good + idea). +\end{note} + +\subsection{Generating a key pair} + +To avoid the need to repetitively type a password every time you need +to use your ssh client, I recommend generating a key pair. On a +Unix-like system, the \command{ssh-keygen} command will do the trick. +On Windows, if you're using PuTTY, the \command{puttygen} command is +what you'll need. + +When you generate a key pair, it's usually \emph{highly} advisable to +protect it with a passphrase. (The only time that you might not want +to do this is when you're using the ssh protocol for automated tasks +on a secure network.) + +Simply generating a key pair isn't enough, however. 
You'll need to +add the public key to the set of authorised keys for whatever user +you're logging in remotely as. For servers using OpenSSH (the vast +majority), this will mean adding the public key to a list in a file +called \sfilename{authorized\_keys} in their \sdirname{.ssh} +directory. + +On a Unix-like system, your public key will have a \filename{.pub} +extension. If you're using \command{puttygen} on Windows, you can +save the public key to a file of your choosing, or paste it from the +window it's displayed in straight into the +\sfilename{authorized\_keys} file. + +\subsection{Using an authentication agent} + +An authentication agent is a daemon that stores passphrases in memory +(so it will forget passphrases if you log out and log back in again). +An ssh client will notice if it's running, and query it for a +passphrase. If there's no authentication agent running, or the agent +doesn't store the necessary passphrase, you'll have to type your +passphrase every time Mercurial tries to communicate with a server on +your behalf (e.g.~whenever you pull or push changes). + +The downside of storing passphrases in an agent is that it's possible +for a well-prepared attacker to recover the plain text of your +passphrases, in some cases even if your system has been power-cycled. +You should make your own judgment as to whether this is an acceptable +risk. It certainly saves a lot of repeated typing. + +On Unix-like systems, the agent is called \command{ssh-agent}, and +it's often run automatically for you when you log in. You'll need to +use the \command{ssh-add} command to add passphrases to the agent's +store. On Windows, if you're using PuTTY, the \command{pageant} +command acts as the agent. It adds an icon to your system tray that +will let you manage stored passphrases. + +\subsection{Configuring the server side properly} + +Because ssh can be fiddly to set up if you're new to it, there's a +variety of things that can go wrong. 
Add Mercurial on top, and +there's plenty more scope for head-scratching. Most of these +potential problems occur on the server side, not the client side. The +good news is that once you've gotten a configuration working, it will +usually continue to work indefinitely. + +Before you try using Mercurial to talk to an ssh server, it's best to +make sure that you can use the normal \command{ssh} or \command{putty} +command to talk to the server first. If you run into problems with +using these commands directly, Mercurial surely won't work. Worse, it +will obscure the underlying problem. Any time you want to debug +ssh-related Mercurial problems, you should drop back to making sure +that plain ssh client commands work first, \emph{before} you worry +about whether there's a problem with Mercurial. + +The first thing to be sure of on the server side is that you can +actually log in from another machine at all. If you can't use +\command{ssh} or \command{putty} to log in, the error message you get +may give you a few hints as to what's wrong. The most common problems +are as follows. +\begin{itemize} +\item If you get a ``connection refused'' error, either there isn't an + SSH daemon running on the server at all, or it's inaccessible due to + firewall configuration. +\item If you get a ``no route to host'' error, you either have an + incorrect address for the server or a seriously locked down firewall + that won't admit its existence at all. +\item If you get a ``permission denied'' error, you may have mistyped + the username on the server, or you could have mistyped your key's + passphrase or the remote user's password. +\end{itemize} +In summary, if you're having trouble talking to the server's ssh +daemon, first make sure that one is running at all. On many systems +it will be installed, but disabled, by default. 
Once you're done with +this step, you should then check that the server's firewall is +configured to allow incoming connections on the port the ssh daemon is +listening on (usually~22). Don't worry about more exotic +possibilities for misconfiguration until you've checked these two +first. + +If you're using an authentication agent on the client side to store +passphrases for your keys, you ought to be able to log into the server +without being prompted for a passphrase or a password. If you're +prompted for a passphrase, there are a few possible culprits. +\begin{itemize} +\item You might have forgotten to use \command{ssh-add} or + \command{pageant} to store the passphrase. +\item You might have stored the passphrase for the wrong key. +\end{itemize} +If you're being prompted for the remote user's password, there are +another few possible problems to check. +\begin{itemize} +\item Either the user's home directory or their \sdirname{.ssh} + directory might have excessively liberal permissions. As a result, + the ssh daemon will not trust or read their + \sfilename{authorized\_keys} file. For example, a group-writable + home or \sdirname{.ssh} directory will often cause this symptom. +\item The user's \sfilename{authorized\_keys} file may have a problem. + If anyone other than the user owns or can write to that file, the + ssh daemon will not trust or read it. +\end{itemize} + +In the ideal world, you should be able to run the following command +successfully, and it should print exactly one line of output, the +current date and time. +\begin{codesample2} + ssh myserver date +\end{codesample2} + +If, on your server, you have login scripts that print banners or other +junk even when running non-interactive commands like this, you should +fix them before you continue, so that they only print output if +they're run interactively. Otherwise these banners will at least +clutter up Mercurial's output. 
Worse, they could potentially cause +problems with running Mercurial commands remotely. Mercurial +tries to detect and ignore banners in non-interactive \command{ssh} +sessions, but it is not foolproof. (If you're editing your login +scripts on your server, the usual way to see if a login script is +running in an interactive shell is to check the return code from the +command \Verb|tty -s|.) + +Once you've verified that plain old ssh is working with your server, +the next step is to ensure that Mercurial runs on the server. The +following command should run successfully: +\begin{codesample2} + ssh myserver hg version +\end{codesample2} +If you see an error message instead of normal \hgcmd{version} output, +this is usually because you haven't installed Mercurial to +\dirname{/usr/bin}. Don't worry if this is the case; you don't need +to do that. But you should check for a few possible problems. +\begin{itemize} +\item Is Mercurial really installed on the server at all? I know this + sounds trivial, but it's worth checking! +\item Maybe your shell's search path (usually set via the \envar{PATH} + environment variable) is simply misconfigured. +\item Perhaps your \envar{PATH} environment variable is only being set + to point to the location of the \command{hg} executable if the login + session is interactive. This can happen if you're setting the path + in the wrong shell login script. See your shell's documentation for + details. +\item The \envar{PYTHONPATH} environment variable may need to contain + the path to the Mercurial Python modules. It might not be set at + all; it could be incorrect; or it may be set only if the login is + interactive. +\end{itemize} + +If you can run \hgcmd{version} over an ssh connection, well done! +You've got the server and client sorted out. You should now be able +to use Mercurial to access repositories hosted by that username on +that server. 
If you run into problems with Mercurial and ssh at this +point, try using the \hggopt{--debug} option to get a clearer picture +of what's going on. + +\subsection{Using compression with ssh} + +Mercurial does not compress data when it uses the ssh protocol, +because the ssh protocol can transparently compress data. However, +the default behaviour of ssh clients is \emph{not} to request +compression. + +Over any network other than a fast LAN (even a wireless network), +using compression is likely to significantly speed up Mercurial's +network operations. For example, over a WAN, someone measured +compression as reducing the amount of time required to clone a +particularly large repository from~51 minutes to~17 minutes. + +Both \command{ssh} and \command{plink} accept a \cmdopt{ssh}{-C} +option which turns on compression. You can easily edit your \hgrc\ to +enable compression for all of Mercurial's uses of the ssh protocol. +\begin{codesample2} + [ui] + ssh = ssh -C +\end{codesample2} + +If you use \command{ssh}, you can configure it to always use +compression when talking to your server. To do this, edit your +\sfilename{.ssh/config} file (which may not yet exist), as follows. +\begin{codesample2} + Host hg + Compression yes + HostName hg.example.com +\end{codesample2} +This defines an alias, \texttt{hg}. When you use it on the +\command{ssh} command line or in a Mercurial \texttt{ssh}-protocol +URL, it will cause \command{ssh} to connect to \texttt{hg.example.com} +and use compression. This gives you both a shorter name to type and +compression, each of which is a good thing in its own right. + +\section{Serving over HTTP using CGI} +\label{sec:collab:cgi} + +Depending on how ambitious you are, configuring Mercurial's CGI +interface can take anything from a few moments to several hours. + +We'll begin with the simplest of examples, and work our way towards a +more complex configuration. 
Even for the most basic case, you're +almost certainly going to need to read and modify your web server's +configuration. + +\begin{note} + Configuring a web server is a complex, fiddly, and highly + system-dependent activity. I can't possibly give you instructions + that will cover anything like all of the cases you will encounter. + Please use your discretion and judgment in following the sections + below. Be prepared to make plenty of mistakes, and to spend a lot + of time reading your server's error logs. +\end{note} + +\subsection{Web server configuration checklist} + +Before you continue, do take a few moments to check a few aspects of +your system's setup. + +\begin{enumerate} +\item Do you have a web server installed at all? Mac OS X ships with + Apache, but many other systems may not have a web server installed. +\item If you have a web server installed, is it actually running? On + most systems, even if one is present, it will be disabled by + default. +\item Is your server configured to allow you to run CGI programs in + the directory where you plan to do so? Most servers default to + explicitly disabling the ability to run CGI programs. +\end{enumerate} + +If you don't have a web server installed, and don't have substantial +experience configuring Apache, you should consider using the +\texttt{lighttpd} web server instead of Apache. Apache has a +well-deserved reputation for baroque and confusing configuration. +While \texttt{lighttpd} is less capable in some ways than Apache, most +of these capabilities are not relevant to serving Mercurial +repositories. And \texttt{lighttpd} is undeniably \emph{much} easier +to get started with than Apache. + +\subsection{Basic CGI configuration} + +On Unix-like systems, it's common for users to have a subdirectory +named something like \dirname{public\_html} in their home directory, +from which they can serve up web pages. 
A file named \filename{foo} +in this directory will be accessible at a URL of the form +\texttt{http://www.example.com/\~username/foo}. + +To get started, find the \sfilename{hgweb.cgi} script that should be +present in your Mercurial installation. If you can't quickly find a +local copy on your system, simply download one from the master +Mercurial repository at +\url{http://www.selenic.com/repo/hg/raw-file/tip/hgweb.cgi}. + +You'll need to copy this script into your \dirname{public\_html} +directory, and ensure that it's executable. +\begin{codesample2} + cp .../hgweb.cgi ~/public_html + chmod 755 ~/public_html/hgweb.cgi +\end{codesample2} +The \texttt{755} argument to \command{chmod} is a little more general +than just making the script executable: it ensures that the script is +executable by anyone, and that ``group'' and ``other'' write +permissions are \emph{not} set. If you were to leave those write +permissions enabled, Apache's \texttt{suexec} subsystem would likely +refuse to execute the script. In fact, \texttt{suexec} also insists +that the \emph{directory} in which the script resides must not be +writable by others. +\begin{codesample2} + chmod 755 ~/public_html +\end{codesample2} + +\subsubsection{What could \emph{possibly} go wrong?} +\label{sec:collab:wtf} + +Once you've copied the CGI script into place, go into a web browser, +and try to open the URL \url{http://myhostname/~myuser/hgweb.cgi}, +\emph{but} brace yourself for instant failure. There's a high +probability that trying to visit this URL will fail, and there are +many possible reasons for this. In fact, you're likely to stumble +over almost every one of the possible errors below, so please read +carefully. The following are all of the problems I ran into on a +system running Fedora~7, with a fresh installation of Apache, and a +user account that I created specially to perform this exercise. + +Your web server may have per-user directories disabled. 
If you're +using Apache, search your config file for a \texttt{UserDir} +directive. If there's none present, per-user directories will be +disabled. If one exists, but its value is \texttt{disabled}, then +per-user directories will be disabled. Otherwise, the string after +\texttt{UserDir} gives the name of the subdirectory that Apache will +look in under your home directory, for example \dirname{public\_html}. + +Your file access permissions may be too restrictive. The web server +must be able to traverse your home directory and directories under +your \dirname{public\_html} directory, and read files under the latter +too. Here's a quick recipe to help you to make your permissions more +appropriate. +\begin{codesample2} + chmod 755 ~ + find ~/public_html -type d -print0 | xargs -0r chmod 755 + find ~/public_html -type f -print0 | xargs -0r chmod 644 +\end{codesample2} + +The other possibility with permissions is that you might get a +completely empty window when you try to load the script. In this +case, it's likely that your access permissions are \emph{too + permissive}. Apache's \texttt{suexec} subsystem won't execute a +script that's group-~or world-writable, for example. + +Your web server may be configured to disallow execution of CGI +programs in your per-user web directory. Here's Apache's +default per-user configuration from my Fedora system. +\begin{codesample2} + <Directory /home/*/public_html> + AllowOverride FileInfo AuthConfig Limit + Options MultiViews Indexes SymLinksIfOwnerMatch IncludesNoExec + <Limit GET POST OPTIONS> + Order allow,deny + Allow from all + </Limit> + <LimitExcept GET POST OPTIONS> + Order deny,allow + Deny from all + </LimitExcept> + </Directory> +\end{codesample2} +If you find a similar-looking \texttt{Directory} group in your Apache +configuration, the directive to look at inside it is \texttt{Options}. +Add \texttt{ExecCGI} to the end of this list if it's missing, and +restart the web server. 
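The permission rules above are easy to get wrong, so here is a minimal Python sketch that reports whether a path is group-~or world-writable, which is the condition the text says \texttt{suexec} rejects. It is not part of Mercurial or Apache. The function name \texttt{too\_permissive} is hypothetical, and the demonstration uses a throwaway temporary file rather than a real CGI script.

```python
import os
import stat
import tempfile


def too_permissive(path):
    """Return True if path is group- or world-writable.

    Hypothetical helper for illustration only; it checks just the two
    write bits mentioned in the text, not everything suexec inspects.
    """
    mode = os.stat(path).st_mode
    return bool(mode & (stat.S_IWGRP | stat.S_IWOTH))


if __name__ == "__main__":
    # Use a temporary stand-in file so nothing real is modified.
    with tempfile.TemporaryDirectory() as tmp:
        script = os.path.join(tmp, "hgweb.cgi")
        open(script, "w").close()

        os.chmod(script, 0o775)   # group-writable
        print(too_permissive(script))

        os.chmod(script, 0o755)   # the mode recommended in the text
        print(too_permissive(script))
```

Running the same check on your \dirname{public\_html} directory as well as the script covers both of the cases mentioned above. The sketch assumes a Unix-like system where \texttt{chmod} mode bits are meaningful.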
+ +If you find that Apache serves you the text of the CGI script instead +of executing it, you may need to either uncomment (if already present) +or add a directive like this. +\begin{codesample2} + AddHandler cgi-script .cgi +\end{codesample2} + +The next possibility is that you might be served with a colourful +Python backtrace claiming that it can't import a +\texttt{mercurial}-related module. This is actually progress! The +server is now capable of executing your CGI script. This error is +only likely to occur if you're running a private installation of +Mercurial, instead of a system-wide version. Remember that the web +server runs the CGI program without any of the environment variables +that you take for granted in an interactive session. If this error +happens to you, edit your copy of \sfilename{hgweb.cgi} and follow the +directions inside it to correctly set your \envar{PYTHONPATH} +environment variable. + +Finally, you are \emph{certain} to be served with another colourful +Python backtrace: this one will complain that it can't find +\dirname{/path/to/repository}. Edit your \sfilename{hgweb.cgi} script +and replace the \dirname{/path/to/repository} string with the complete +path to the repository you want to serve up. + +At this point, when you try to reload the page, you should be +presented with a nice HTML view of your repository's history. Whew! + +\subsubsection{Configuring lighttpd} + +To be exhaustive in my experiments, I tried configuring the +increasingly popular \texttt{lighttpd} web server to serve the same +repository as I described with Apache above. I had already overcome +all of the problems I outlined with Apache, many of which are not +server-specific. As a result, I was fairly sure that my file and +directory permissions were good, and that my \sfilename{hgweb.cgi} +script was properly edited. 
+ +Once I had Apache running, getting \texttt{lighttpd} to serve the +repository was a snap (in other words, even if you're trying to use +\texttt{lighttpd}, you should read the Apache section). I first had +to edit the \texttt{mod\_access} section of its config file to enable +\texttt{mod\_cgi} and \texttt{mod\_userdir}, both of which were +disabled by default on my system. I then added a few lines to the end +of the config file, to configure these modules. +\begin{codesample2} + userdir.path = "public_html" + cgi.assign = ( ".cgi" => "" ) +\end{codesample2} +With this done, \texttt{lighttpd} ran immediately for me. If I had +configured \texttt{lighttpd} before Apache, I'd almost certainly have +run into many of the same system-level configuration problems as I did +with Apache. However, I found \texttt{lighttpd} to be noticeably +easier to configure than Apache, even though I've used Apache for over +a decade, and this was my first exposure to \texttt{lighttpd}. + +\subsection{Sharing multiple repositories with one CGI script} + +The \sfilename{hgweb.cgi} script only lets you publish a single +repository, which is an annoying restriction. If you want to publish +more than one without wracking yourself with multiple copies of the +same script, each with different names, a better choice is to use the +\sfilename{hgwebdir.cgi} script. + +The procedure to configure \sfilename{hgwebdir.cgi} is only a little +more involved than for \sfilename{hgweb.cgi}. First, you must obtain +a copy of the script. If you don't have one handy, you can download a +copy from the master Mercurial repository at +\url{http://www.selenic.com/repo/hg/raw-file/tip/hgwebdir.cgi}. + +You'll need to copy this script into your \dirname{public\_html} +directory, and ensure that it's executable. 
+\begin{codesample2} + cp .../hgwebdir.cgi ~/public_html + chmod 755 ~/public_html ~/public_html/hgwebdir.cgi +\end{codesample2} +With basic configuration out of the way, try to visit +\url{http://myhostname/~myuser/hgwebdir.cgi} in your browser. It +should display an empty list of repositories. If you get a blank +window or error message, try walking through the list of potential +problems in section~\ref{sec:collab:wtf}. + +The \sfilename{hgwebdir.cgi} script relies on an external +configuration file. By default, it searches for a file named +\sfilename{hgweb.config} in the same directory as itself. You'll need +to create this file, and make it world-readable. The format of the +file is similar to a Windows ``ini'' file, as understood by Python's +\texttt{ConfigParser}~\cite{web:configparser} module. + +The easiest way to configure \sfilename{hgwebdir.cgi} is with a +section named \texttt{collections}. This will automatically publish +\emph{every} repository under the directories you name. The section +should look like this: +\begin{codesample2} + [collections] + /my/root = /my/root +\end{codesample2} +Mercurial interprets this by looking at the directory name on the +\emph{right} hand side of the ``\texttt{=}'' sign; finding +repositories in that directory hierarchy; and using the text on the +\emph{left} to strip off matching text from the names it will actually +list in the web interface. The remaining component of a path after +this stripping has occurred is called a ``virtual path''. + +Given the example above, if we have a repository whose local path is +\dirname{/my/root/this/repo}, the CGI script will strip the leading +\dirname{/my/root} from the name, and publish the repository with a +virtual path of \dirname{this/repo}. If the base URL for our CGI +script is \url{http://myhostname/~myuser/hgwebdir.cgi}, the complete +URL for that repository will be +\url{http://myhostname/~myuser/hgwebdir.cgi/this/repo}. 
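The prefix stripping described above can be sketched in a few lines of Python. This is an illustration of the rule as the text states it, not Mercurial's own code: the helper name \texttt{virtual\_path} is hypothetical, and the sketch assumes Unix-style paths.

```python
import posixpath


def virtual_path(repo_path, prefix):
    """Strip a [collections] prefix from a repository's local path.

    Hypothetical helper for illustration; not taken from Mercurial.
    """
    repo = posixpath.normpath(repo_path)
    pre = posixpath.normpath(prefix)
    # The repository must live at, or somewhere below, the prefix.
    if repo != pre and not repo.startswith(pre.rstrip("/") + "/"):
        raise ValueError("repository is not under the prefix")
    return repo[len(pre):].lstrip("/")


if __name__ == "__main__":
    base_url = "http://myhostname/~myuser/hgwebdir.cgi"
    vpath = virtual_path("/my/root/this/repo", "/my/root")
    # Join the CGI script's base URL and the virtual path.
    print(base_url + "/" + vpath)
```

With the paths from the example above, the helper yields \dirname{this/repo}, which is then appended to the base URL of the CGI script.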
+ +If we replace \dirname{/my/root} on the left hand side of this example +with \dirname{/my}, then \sfilename{hgwebdir.cgi} will only strip off +\dirname{/my} from the repository name, and will give us a virtual +path of \dirname{root/this/repo} instead of \dirname{this/repo}. + +The \sfilename{hgwebdir.cgi} script will recursively search each +directory listed in the \texttt{collections} section of its +configuration file, but it will \emph{not} recurse into the +repositories it finds. + +The \texttt{collections} mechanism makes it easy to publish many +repositories in a ``fire and forget'' manner. You only need to set up +the CGI script and configuration file one time. Afterwards, you can +publish or unpublish a repository at any time by simply moving it +into, or out of, the directory hierarchy in which you've configured +\sfilename{hgwebdir.cgi} to look. + +\subsubsection{Explicitly specifying which repositories to publish} + +In addition to the \texttt{collections} mechanism, the +\sfilename{hgwebdir.cgi} script allows you to publish a specific list +of repositories. To do so, create a \texttt{paths} section, with +contents of the following form. +\begin{codesample2} + [paths] + repo1 = /my/path/to/some/repo + repo2 = /some/path/to/another +\end{codesample2} +In this case, the virtual path (the component that will appear in a +URL) is on the left hand side of each definition, while the path to +the repository is on the right. Notice that there does not need to be +any relationship between the virtual path you choose and the location +of a repository in your filesystem. + +If you wish, you can use both the \texttt{collections} and +\texttt{paths} mechanisms simultaneously in a single configuration +file. + +\begin{note} + If multiple repositories have the same virtual path, + \sfilename{hgwebdir.cgi} will not report an error. Instead, it will + behave unpredictably. 
+\end{note} + +\subsection{Downloading source archives} + +Mercurial's web interface lets users download an archive of any +revision. This archive will contain a snapshot of the working +directory as of that revision, but it will not contain a copy of the +repository data. + +By default, this feature is not enabled. To enable it, you'll need to +add an \rcitem{web}{allow\_archive} item to the \rcsection{web} +section of your \hgrc. + +\subsection{Web configuration options} + +Mercurial's web interfaces (the \hgcmd{serve} command, and the +\sfilename{hgweb.cgi} and \sfilename{hgwebdir.cgi} scripts) have a +number of configuration options that you can set. These belong in a +section named \rcsection{web}. +\begin{itemize} +\item[\rcitem{web}{allow\_archive}] Determines which (if any) archive + download mechanisms Mercurial supports. If you enable this + feature, users of the web interface will be able to download an + archive of whatever revision of a repository they are viewing. + To enable the archive feature, this item must take the form of a + sequence of words drawn from the list below. + \begin{itemize} + \item[\texttt{bz2}] A \command{tar} archive, compressed using + \texttt{bzip2} compression. This has the best compression ratio, + but uses the most CPU time on the server. + \item[\texttt{gz}] A \command{tar} archive, compressed using + \texttt{gzip} compression. + \item[\texttt{zip}] A \command{zip} archive, compressed using Deflate + compression. This format has the worst compression ratio, but is + widely used in the Windows world. + \end{itemize} + If you provide an empty list, or don't have an + \rcitem{web}{allow\_archive} entry at all, this feature will be + disabled. Here is an example of how to enable all three supported + formats. + \begin{codesample4} + [web] + allow_archive = bz2 gz zip + \end{codesample4} +\item[\rcitem{web}{allowpull}] Boolean. 
Determines whether the web + interface allows remote users to \hgcmd{pull} and \hgcmd{clone} this + repository over~HTTP. If set to \texttt{no} or \texttt{false}, only + the ``human-oriented'' portion of the web interface is available. +\item[\rcitem{web}{contact}] String. A free-form (but preferably + brief) string identifying the person or group in charge of the + repository. This often contains the name and email address of a + person or mailing list. It often makes sense to place this entry in + a repository's own \sfilename{.hg/hgrc} file, but it can make sense + to use in a global \hgrc\ if every repository has a single + maintainer. +\item[\rcitem{web}{maxchanges}] Integer. The default maximum number + of changesets to display in a single page of output. +\item[\rcitem{web}{maxfiles}] Integer. The default maximum number + of modified files to display in a single page of output. +\item[\rcitem{web}{stripes}] Integer. If the web interface displays + alternating ``stripes'' to make it easier to visually align rows + when you are looking at a table, this number controls the number of + rows in each stripe. +\item[\rcitem{web}{style}] Controls the template Mercurial uses to + display the web interface. Mercurial ships with two web templates, + named \texttt{default} and \texttt{gitweb} (the latter is much more + visually attractive). You can also specify a custom template of + your own; see chapter~\ref{chap:template} for details. Here, you + can see how to enable the \texttt{gitweb} style. + \begin{codesample4} + [web] + style = gitweb + \end{codesample4} +\item[\rcitem{web}{templates}] Path. The directory in which to search + for template files. By default, Mercurial searches in the directory + in which it was installed. +\end{itemize} +If you are using \sfilename{hgwebdir.cgi}, you can place a few +configuration items in a \rcsection{web} section of the +\sfilename{hgweb.config} file instead of a \hgrc\ file, for +convenience. 
These items are \rcitem{web}{motd} and +\rcitem{web}{style}. + +\subsubsection{Options specific to an individual repository} + +A few \rcsection{web} configuration items ought to be placed in a +repository's local \sfilename{.hg/hgrc}, rather than a user's or +global \hgrc. +\begin{itemize} +\item[\rcitem{web}{description}] String. A free-form (but preferably + brief) string that describes the contents or purpose of the + repository. +\item[\rcitem{web}{name}] String. The name to use for the repository + in the web interface. This overrides the default name, which is the + last component of the repository's path. +\end{itemize} + +\subsubsection{Options specific to the \hgcmd{serve} command} + +Some of the items in the \rcsection{web} section of a \hgrc\ file are +only for use with the \hgcmd{serve} command. +\begin{itemize} +\item[\rcitem{web}{accesslog}] Path. The name of a file into which to + write an access log. By default, the \hgcmd{serve} command writes + this information to standard output, not to a file. Log entries are + written in the standard ``combined'' file format used by almost all + web servers. +\item[\rcitem{web}{address}] String. The local address on which the + server should listen for incoming connections. By default, the + server listens on all addresses. +\item[\rcitem{web}{errorlog}] Path. The name of a file into which to + write an error log. By default, the \hgcmd{serve} command writes this + information to standard error, not to a file. +\item[\rcitem{web}{ipv6}] Boolean. Whether to use the IPv6 protocol. + By default, IPv6 is not used. +\item[\rcitem{web}{port}] Integer. The TCP~port number on which the + server should listen. The default port number used is~8000. +\end{itemize} + +\subsubsection{Choosing the right \hgrc\ file to add \rcsection{web} + items to} + +It is important to remember that a web server like Apache or +\texttt{lighttpd} will run under a user~ID that is different to yours. 
+CGI scripts run by your server, such as \sfilename{hgweb.cgi}, will +usually also run under that user~ID. + +If you add \rcsection{web} items to your own personal \hgrc\ file, CGI +scripts won't read that \hgrc\ file. Those settings will thus only +affect the behaviour of the \hgcmd{serve} command when you run it. To +cause CGI scripts to see your settings, either create a \hgrc\ file in +the home directory of the user ID that runs your web server, or add +those settings to a system-wide \hgrc\ file. + + +%%% Local Variables: +%%% mode: latex +%%% TeX-master: "00book" +%%% End: diff -r 3e78daaad99b -r b05e35d641e4 es/filenames.tex --- a/es/filenames.tex Fri Nov 07 21:33:22 2008 -0500 +++ b/es/filenames.tex Fri Nov 07 21:42:57 2008 -0500 @@ -0,0 +1,306 @@ +\chapter{File names and pattern matching} +\label{chap:names} + +Mercurial provides mechanisms that let you work with file names in a +consistent and expressive way. + +\section{Simple file naming} + +Mercurial uses a unified piece of machinery ``under the hood'' to +handle file names. Every command behaves uniformly with respect to +file names. The way in which commands work with file names is as +follows. + +If you explicitly name real files on the command line, Mercurial works +with exactly those files, as you would expect. +\interaction{filenames.files} + +When you provide a directory name, Mercurial will interpret this as +``operate on every file in this directory and its subdirectories''. +Mercurial traverses the files and subdirectories in a directory in +alphabetical order. When it encounters a subdirectory, it will +traverse that subdirectory before continuing with the current +directory. +\interaction{filenames.dirs} + +\section{Running commands without any file names} + +Mercurial's commands that work with file names have useful default +behaviours when you invoke them without providing any file names or +patterns. What kind of behaviour you should expect depends on what +the command does. 
Here are a few rules of thumb you can use to +predict what a command is likely to do if you don't give it any names +to work with. +\begin{itemize} +\item Most commands will operate on the entire working directory. + This is what the \hgcmd{add} command does, for example. +\item If the command has effects that are difficult or impossible to + reverse, it will force you to explicitly provide at least one name + or pattern (see below). This protects you from accidentally + deleting files by running \hgcmd{remove} with no arguments, for + example. +\end{itemize} + +It's easy to work around these default behaviours if they don't suit +you. If a command normally operates on the whole working directory, +you can invoke it on just the current directory and its subdirectories +by giving it the name ``\dirname{.}''. +\interaction{filenames.wdir-subdir} + +Along the same lines, some commands normally print file names relative +to the root of the repository, even if you're invoking them from a +subdirectory. Such a command will print file names relative to your +subdirectory if you give it explicit names. Here, we're going to run +\hgcmd{status} from a subdirectory, and get it to operate on the +entire working directory while printing file names relative to our +subdirectory, by passing it the output of the \hgcmd{root} command. +\interaction{filenames.wdir-relname} + +\section{Telling you what's going on} + +The \hgcmd{add} example in the preceding section illustrates something +else that's helpful about Mercurial commands. If a command operates +on a file that you didn't name explicitly on the command line, it will +usually print the name of the file, so that you will not be surprised +by what's going on. + +The principle here is of \emph{least surprise}. If you've exactly +named a file on the command line, there's no point in repeating it +back at you. 
If Mercurial is acting on a file \emph{implicitly}, +because you provided no names, or a directory, or a pattern (see +below), it's safest to tell you what it's doing. + +For commands that behave this way, you can silence them using the +\hggopt{-q} option. You can also get them to print the name of every +file, even those you've named explicitly, using the \hggopt{-v} +option. + +\section{Using patterns to identify files} + +In addition to working with file and directory names, Mercurial lets +you use \emph{patterns} to identify files. Mercurial's pattern +handling is expressive. + +On Unix-like systems (Linux, MacOS, etc.), the job of matching file +names to patterns normally falls to the shell. On these systems, you +must explicitly tell Mercurial that a name is a pattern. On Windows, +the shell does not expand patterns, so Mercurial will automatically +identify names that are patterns, and expand them for you. + +To provide a pattern in place of a regular name on the command line, +the mechanism is simple: +\begin{codesample2} + syntax:patternbody +\end{codesample2} +That is, a pattern is identified by a short text string that says what +kind of pattern this is, followed by a colon, followed by the actual +pattern. + +Mercurial supports two kinds of pattern syntax. The most frequently +used is called \texttt{glob}; this is the same kind of pattern +matching used by the Unix shell, and should be familiar to Windows +command prompt users, too. + +When Mercurial does automatic pattern matching on Windows, it uses +\texttt{glob} syntax. You can thus omit the ``\texttt{glob:}'' prefix +on Windows, but it's safe to use it, too. + +The \texttt{re} syntax is more powerful; it lets you specify patterns +using regular expressions, also known as regexps. + +By the way, in the examples that follow, notice that I'm careful to +wrap all of my patterns in quote characters, so that they won't get +expanded by the shell before Mercurial sees them. 
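The ``syntax:patternbody'' convention described above can be sketched in a few lines of Python. This is only an illustration of the naming scheme, not Mercurial's own code: the helper name \texttt{split\_pattern} and the choice to return \texttt{None} for a plain name are assumptions made for this example.

```python
# Hypothetical helper, not part of Mercurial: split a command-line name
# into (kind, body) following the "syntax:patternbody" convention.
KNOWN_KINDS = ("glob", "re")  # the two pattern syntaxes named in the text


def split_pattern(name):
    """Return (kind, body); kind is None when name is a plain file name."""
    kind, sep, body = name.partition(":")
    if sep and kind in KNOWN_KINDS:
        return kind, body
    # No recognised prefix: treat the whole string as an ordinary name.
    return None, name


print(split_pattern("glob:*.py"))
print(split_pattern("re:.*\\.c$"))
print(split_pattern("src/main.c"))
```

A name without a recognised prefix is returned unchanged, which mirrors the rule that on Unix-like systems you must say explicitly when a name is a pattern.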
+ +\subsection{Shell-style \texttt{glob} patterns} + +This is an overview of the kinds of patterns you can use when you're +matching on glob patterns. + +The ``\texttt{*}'' character matches any string, within a single +directory. +\interaction{filenames.glob.star} + +The ``\texttt{**}'' pattern matches any string, and crosses directory +boundaries. It's not a standard Unix glob token, but it's accepted by +several popular Unix shells, and is very useful. +\interaction{filenames.glob.starstar} + +The ``\texttt{?}'' pattern matches any single character. +\interaction{filenames.glob.question} + +The ``\texttt{[}'' character begins a \emph{character class}. This +matches any single character within the class. The class ends with a +``\texttt{]}'' character. A class may contain multiple \emph{range}s +of the form ``\texttt{a-f}'', which is shorthand for +``\texttt{abcdef}''. +\interaction{filenames.glob.range} +If the first character after the ``\texttt{[}'' in a character class +is a ``\texttt{!}'', it \emph{negates} the class, making it match any +single character not in the class. + +A ``\texttt{\{}'' begins a group of subpatterns, where the whole group +matches if any subpattern in the group matches. The ``\texttt{,}'' +character separates subpatterns, and ``\texttt{\}}'' ends the group. +\interaction{filenames.glob.group} + +\subsubsection{Watch out!} + +Don't forget that if you want to match a pattern in any directory, you +should not be using the ``\texttt{*}'' match-any token, as this will +only match within one directory. Instead, use the ``\texttt{**}'' +token. This small example illustrates the difference between the two. +\interaction{filenames.glob.star-starstar} + +\subsection{Regular expression matching with \texttt{re} patterns} + +Mercurial accepts the same regular expression syntax as the Python +programming language (it uses Python's regexp engine internally). 
+This is based on the Perl language's regexp syntax, which is the most +popular dialect in use (it's also used in Java, for example). + +I won't discuss Mercurial's regexp dialect in any detail here, as +regexps are not often used. Perl-style regexps are in any case +already exhaustively documented on a multitude of web sites, and in +many books. Instead, I will focus here on a few things you should +know if you find yourself needing to use regexps with Mercurial. + +A regexp is matched against an entire file name, relative to the root +of the repository. In other words, even if you're already in +subdirectory \dirname{foo}, if you want to match files under this +directory, your pattern must start with ``\texttt{foo/}''. + +One thing to note, if you're familiar with Perl-style regexps, is that +Mercurial's are \emph{rooted}. That is, a regexp starts matching +against the beginning of a string; it doesn't look for a match +anywhere within the string. To match anywhere in a string, start +your pattern with ``\texttt{.*}''. + +\section{Filtering files} + +Not only does Mercurial give you a variety of ways to specify files; +it lets you further winnow those files using \emph{filters}. Commands +that work with file names accept two filtering options. +\begin{itemize} +\item \hggopt{-I}, or \hggopt{--include}, lets you specify a pattern + that file names must match in order to be processed. +\item \hggopt{-X}, or \hggopt{--exclude}, gives you a way to + \emph{avoid} processing files, if they match this pattern. +\end{itemize} +You can provide multiple \hggopt{-I} and \hggopt{-X} options on the +command line, and intermix them as you please. Mercurial interprets +the patterns you provide using glob syntax by default (but you can use +regexps if you need to). + +You can read a \hggopt{-I} filter as ``process only the files that +match this filter''.
+\interaction{filenames.filter.include} +The \hggopt{-X} filter is best read as ``process only the files that +don't match this pattern''. +\interaction{filenames.filter.exclude} + +\section{Ignoring unwanted files and directories} + +XXX. + +\section{Case sensitivity} +\label{sec:names:case} + +If you're working in a mixed development environment that contains +both Linux (or other Unix) systems and Macs or Windows systems, you +should keep in the back of your mind the knowledge that they treat the +case (``N'' versus ``n'') of file names in incompatible ways. This is +not very likely to affect you, and it's easy to deal with if it does, +but it could surprise you if you don't know about it. + +Operating systems and filesystems differ in the way they handle the +\emph{case} of characters in file and directory names. There are +three common ways to handle case in names. +\begin{itemize} +\item Completely case insensitive. Uppercase and lowercase versions + of a letter are treated as identical, both when creating a file and + during subsequent accesses. This is common on older DOS-based + systems. +\item Case preserving, but insensitive. When a file or directory is + created, the case of its name is stored, and can be retrieved and + displayed by the operating system. When an existing file is being + looked up, its case is ignored. This is the standard arrangement on + Windows and MacOS. The names \filename{foo} and \filename{FoO} + identify the same file. This treatment of uppercase and lowercase + letters as interchangeable is also referred to as \emph{case + folding}. +\item Case sensitive. The case of a name is significant at all times. + The names \filename{foo} and \filename{FoO} identify different files. This + is the way Linux and Unix systems normally work. +\end{itemize} + +On Unix-like systems, it is possible to have any or all of the above +ways of handling case in action at once.
For example, if you use a +USB thumb drive formatted with a FAT32 filesystem on a Linux system, +Linux will handle names on that filesystem in a case preserving, but +insensitive, way. + +\subsection{Safe, portable repository storage} + +Mercurial's repository storage mechanism is \emph{case safe}. It +translates file names so that they can be safely stored on both case +sensitive and case insensitive filesystems. This means that you can +use normal file copying tools to transfer a Mercurial repository onto, +for example, a USB thumb drive, and safely move that drive and +repository back and forth between a Mac, a PC running Windows, and a +Linux box. + +\subsection{Detecting case conflicts} + +When operating in the working directory, Mercurial honours the naming +policy of the filesystem where the working directory is located. If +the filesystem is case preserving, but insensitive, Mercurial will +treat names that differ only in case as the same. + +An important aspect of this approach is that it is possible to commit +a changeset on a case sensitive (typically Linux or Unix) filesystem +that will cause trouble for users on case insensitive (usually Windows +and MacOS) filesystems. If a Linux user commits changes to two files, one +named \filename{myfile.c} and the other named \filename{MyFile.C}, +they will be stored correctly in the repository. And in the working +directories of other Linux users, they will be correctly represented +as separate files. + +If a Windows or Mac user pulls this change, they will not initially +have a problem, because Mercurial's repository storage mechanism is +case safe. However, once they try to \hgcmd{update} the working +directory to that changeset, or \hgcmd{merge} with that changeset, +Mercurial will spot the conflict between the two file names that the +filesystem would treat as the same, and forbid the update or merge +from occurring.
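To make the idea of a case folding conflict concrete, here is a minimal Python sketch that groups names colliding when case is ignored. It is an illustration only and not how Mercurial itself detects conflicts: the function name \texttt{find\_case\_conflicts} is hypothetical, and \texttt{str.casefold()} is assumed here as a stand-in for whatever folding rule a real filesystem applies.

```python
from collections import defaultdict


def find_case_conflicts(names):
    """Group file names that differ only in case.

    Hypothetical helper for illustration; str.casefold() approximates
    the folding a case insensitive filesystem would perform.
    """
    groups = defaultdict(list)
    for name in names:
        groups[name.casefold()].append(name)
    # Only groups with more than one spelling are conflicts.
    return [sorted(group) for group in groups.values() if len(group) > 1]


conflicts = find_case_conflicts(["myfile.c", "MyFile.C", "README"])
print(conflicts)  # [['MyFile.C', 'myfile.c']]
```

Running a check like this over a list of tracked names on a Linux box is one way to spot names that would collide on Windows or MacOS before anyone commits them.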
+ +\subsection{Fixing a case conflict} + +If you are using Windows or a Mac in a mixed environment where some of +your collaborators are using Linux or Unix, and Mercurial reports a +case folding conflict when you try to \hgcmd{update} or \hgcmd{merge}, +the procedure to fix the problem is simple. + +Just find a nearby Linux or Unix box, clone the problem repository +onto it, and use Mercurial's \hgcmd{rename} command to change the +names of any offending files or directories so that they will no +longer cause case folding conflicts. Commit this change, \hgcmd{pull} +or \hgcmd{push} it across to your Windows or MacOS system, and +\hgcmd{update} to the revision with the non-conflicting names. + +The changeset with case-conflicting names will remain in your +project's history, and you still won't be able to \hgcmd{update} your +working directory to that changeset on a Windows or MacOS system, but +you can continue development unimpeded. + +\begin{note} + Prior to version~0.9.3, Mercurial did not use a case safe repository + storage mechanism, and did not detect case folding conflicts. If + you are using an older version of Mercurial on Windows or MacOS, I + strongly recommend that you upgrade. +\end{note} + +%%% Local Variables: +%%% mode: latex +%%% TeX-master: "00book" +%%% End: diff -r 3e78daaad99b -r b05e35d641e4 es/intro.tex --- a/es/intro.tex Fri Nov 07 21:33:22 2008 -0500 +++ b/es/intro.tex Fri Nov 07 21:42:57 2008 -0500 @@ -0,0 +1,561 @@ +\chapter{Introduction} +\label{chap:intro} + +\section{About revision control} + +Revision control is the process of managing multiple versions of a +piece of information. In its simplest form, this is something that +many people do by hand: every time you modify a file, save it under a +new name that contains a number, each one higher than the number of +the preceding version. 
+ +Manually managing multiple versions of even a single file is an +error-prone task, though, so software tools to help automate this +process have long been available. The earliest automated revision +control tools were intended to help a single user to manage revisions +of a single file. Over the past few decades, the scope of revision +control tools has expanded greatly; they now manage multiple files, +and help multiple people to work together. The best modern revision +control tools have no problem coping with thousands of people working +together on projects that consist of hundreds of thousands of files. + +\subsection{Why use revision control?} + +There are a number of reasons why you or your team might want to use +an automated revision control tool for a project. +\begin{itemize} +\item It will track the history and evolution of your project, so you + don't have to. For every change, you'll have a log of \emph{who} + made it; \emph{why} they made it; \emph{when} they made it; and + \emph{what} the change was. +\item When you're working with other people, revision control software + makes it easier for you to collaborate. For example, when people + more or less simultaneously make potentially incompatible changes, + the software will help you to identify and resolve those conflicts. +\item It can help you to recover from mistakes. If you make a change + that later turns out to be in error, you can revert to an earlier + version of one or more files. In fact, a \emph{really} good + revision control tool will even help you to efficiently figure out + exactly when a problem was introduced (see + section~\ref{sec:undo:bisect} for details). +\item It will help you to work simultaneously on, and manage the drift + between, multiple versions of your project. +\end{itemize} +Most of these reasons are equally valid---at least in theory---whether +you're working on a project by yourself, or with a hundred other +people. 
+ +A key question about the practicality of revision control at these two +different scales (``lone hacker'' and ``huge team'') is how its +\emph{benefits} compare to its \emph{costs}. A revision control tool +that's difficult to understand or use is going to impose a high cost. + +A five-hundred-person project is likely to collapse under its own +weight almost immediately without a revision control tool and process. +In this case, the cost of using revision control might hardly seem +worth considering, since \emph{without} it, failure is almost +guaranteed. + +On the other hand, a one-person ``quick hack'' might seem like a poor +place to use a revision control tool, because surely the cost of using +one must be close to the overall cost of the project. Right? + +Mercurial uniquely supports \emph{both} of these scales of +development. You can learn the basics in just a few minutes, and due +to its low overhead, you can apply revision control to the smallest of +projects with ease. Its simplicity means you won't have a lot of +abstruse concepts or command sequences competing for mental space with +whatever you're \emph{really} trying to do. At the same time, +Mercurial's high performance and peer-to-peer nature let you scale +painlessly to handle large projects. + +No revision control tool can rescue a poorly run project, but a good +choice of tools can make a huge difference to the fluidity with which +you can work on a project. + +\subsection{The many names of revision control} + +Revision control is a diverse field, so much so that it doesn't +actually have a single name or acronym. 
Here are a few of the more +common names and acronyms you'll encounter: +\begin{itemize} +\item Revision control (RCS) +\item Software configuration management (SCM), or configuration management +\item Source code management +\item Source code control, or source control +\item Version control (VCS) +\end{itemize} +Some people claim that these terms actually have different meanings, +but in practice they overlap so much that there's no agreed or even +useful way to tease them apart. + +\section{A short history of revision control} + +The best known of the old-time revision control tools is SCCS (Source +Code Control System), which Marc Rochkind wrote at Bell Labs, in the +early 1970s. SCCS operated on individual files, and required every +person working on a project to have access to a shared workspace on a +single system. Only one person could modify a file at any time; +arbitration for access to files was via locks. It was common for +people to lock files, and later forget to unlock them, preventing +anyone else from modifying those files without the help of an +administrator. + +Walter Tichy developed a free alternative to SCCS in the early 1980s; +he called his program RCS (Revision Control System). Like SCCS, RCS +required developers to work in a single shared workspace, and to lock +files to prevent multiple people from modifying them simultaneously. + +Later in the 1980s, Dick Grune used RCS as a building block for a set +of shell scripts he initially called cmt, but then renamed to CVS +(Concurrent Versions System). The big innovation of CVS was that it +let developers work simultaneously and somewhat independently in their +own personal workspaces. The personal workspaces prevented developers +from stepping on each other's toes all the time, as was common with +SCCS and RCS. Each developer had a copy of every project file, and +could modify their copies independently. They had to merge their +edits prior to committing changes to the central repository.
+ +Brian Berliner took Grune's original scripts and rewrote them in~C, +releasing in 1989 the code that has since developed into the modern +version of CVS. CVS subsequently acquired the ability to operate over +a network connection, giving it a client/server architecture. CVS's +architecture is centralised; only the server has a copy of the history +of the project. Client workspaces just contain copies of recent +versions of the project's files, and a little metadata to tell them +where the server is. CVS has been enormously successful; it is +probably the world's most widely used revision control system. + +In the early 1990s, Sun Microsystems developed an early distributed +revision control system, called TeamWare. A TeamWare workspace +contains a complete copy of the project's history. TeamWare has no +notion of a central repository. (CVS relied upon RCS for its history +storage; TeamWare used SCCS.) + +As the 1990s progressed, awareness grew of a number of problems with +CVS. It records simultaneous changes to multiple files individually, +instead of grouping them together as a single logically atomic +operation. It does not manage its file hierarchy well; it is easy to +make a mess of a repository by renaming files and directories. Worse, +its source code is difficult to read and maintain, which made the +``pain level'' of fixing these architectural problems prohibitive. + +In 2001, Jim Blandy and Karl Fogel, two developers who had worked on +CVS, started a project to replace it with a tool that would have a +better architecture and cleaner code. The result, Subversion, does +not stray from CVS's centralised client/server model, but it adds +multi-file atomic commits, better namespace management, and a number +of other features that make it a generally better tool than CVS. +Since its initial release, it has rapidly grown in popularity. 
+ +More or less simultaneously, Graydon Hoare began working on an +ambitious distributed revision control system that he named Monotone. +While Monotone addresses many of CVS's design flaws and has a +peer-to-peer architecture, it goes beyond earlier (and subsequent) +revision control tools in a number of innovative ways. It uses +cryptographic hashes as identifiers, and has an integral notion of +``trust'' for code from different sources. + +Mercurial began life in 2005. While a few aspects of its design are +influenced by Monotone, Mercurial focuses on ease of use, high +performance, and scalability to very large projects. + +\section{Trends in revision control} + +There has been an unmistakable trend in the development and use of +revision control tools over the past four decades, as people have +become familiar with the capabilities of their tools and constrained +by their limitations. + +The first generation began by managing single files on individual +computers. Although these tools represented a huge advance over +ad-hoc manual revision control, their locking model and reliance on a +single computer limited them to small, tightly-knit teams. + +The second generation loosened these constraints by moving to +network-centered architectures, and managing entire projects at a +time. As projects grew larger, they ran into new problems. With +clients needing to talk to servers very frequently, server scaling +became an issue for large projects. An unreliable network connection +could prevent remote users from being able to talk to the server at +all. As open source projects started making read-only access +available anonymously to anyone, people without commit privileges +found that they could not use the tools to interact with a project in +a natural way, as they could not record their changes. + +The current generation of revision control tools is peer-to-peer in +nature. 
All of these systems have dropped the dependency on a single +central server, and allow people to distribute their revision control +data to where it's actually needed. Collaboration over the Internet +has moved from being constrained by technology to a matter of choice and +consensus. Modern tools can operate offline indefinitely and +autonomously, with a network connection only needed when syncing +changes with another repository. + +\section{A few of the advantages of distributed revision control} + +Even though distributed revision control tools have for several years +been as robust and usable as their previous-generation counterparts, +people using older tools have not yet necessarily woken up to their +advantages. There are a number of ways in which distributed tools +shine relative to centralised ones. + +For an individual developer, distributed tools are almost always much +faster than centralised tools. This is for a simple reason: a +centralised tool needs to talk over the network for many common +operations, because most metadata is stored in a single copy on the +central server. A distributed tool stores all of its metadata +locally. All else being equal, talking over the network adds overhead +to a centralised tool. Don't underestimate the value of a snappy, +responsive tool: you're going to spend a lot of time interacting with +your revision control software. + +Distributed tools are indifferent to the vagaries of your server +infrastructure, again because they replicate metadata to so many +locations. If you use a centralised system and your server catches +fire, you'd better hope that your backup media are reliable, and that +your last backup was recent and actually worked. With a distributed +tool, you have many backups available on every contributor's computer. + +The reliability of your network will affect distributed tools far less +than it will centralised tools.
You can't even use a centralised tool +without a network connection, except for a few highly constrained +commands. With a distributed tool, if your network connection goes +down while you're working, you may not even notice. The only thing +you won't be able to do is talk to repositories on other computers, +something that is relatively rare compared with local operations. If +you have a far-flung team of collaborators, this may be significant. + +\subsection{Advantages for open source projects} + +If you take a shine to an open source project and decide that you +would like to start hacking on it, and that project uses a distributed +revision control tool, you are at once a peer with the people who +consider themselves the ``core'' of that project. If they publish +their repositories, you can immediately copy their project history, +start making changes, and record your work, using the same tools in +the same ways as insiders. By contrast, with a centralised tool, you +must use the software in a ``read only'' mode unless someone grants +you permission to commit changes to their central server. Until then, +you won't be able to record changes, and your local modifications will +be at risk of corruption any time you try to update your client's view +of the repository. + +\subsubsection{The forking non-problem} + +It has been suggested that distributed revision control tools pose +some sort of risk to open source projects because they make it easy to +``fork'' the development of a project. A fork happens when there are +differences in opinion or attitude between groups of developers that +cause them to decide that they can't work together any longer. Each +side takes a more or less complete copy of the project's source code, +and goes off in its own direction. + +Sometimes the camps in a fork decide to reconcile their differences. 
+With a centralised revision control system, the \emph{technical} +process of reconciliation is painful, and has to be performed largely +by hand. You have to decide whose revision history is going to +``win'', and graft the other team's changes into the tree somehow. +This usually loses some or all of one side's revision history. + +What distributed tools do with respect to forking is they make forking +the \emph{only} way to develop a project. Every single change that +you make is potentially a fork point. The great strength of this +approach is that a distributed revision control tool has to be really +good at \emph{merging} forks, because forks are absolutely +fundamental: they happen all the time. + +If every piece of work that everybody does, all the time, is framed in +terms of forking and merging, then what the open source world refers +to as a ``fork'' becomes \emph{purely} a social issue. If anything, +distributed tools \emph{lower} the likelihood of a fork: +\begin{itemize} +\item They eliminate the social distinction that centralised tools + impose: that between insiders (people with commit access) and + outsiders (people without). +\item They make it easier to reconcile after a social fork, because + all that's involved from the perspective of the revision control + software is just another merge. +\end{itemize} + +Some people resist distributed tools because they want to retain tight +control over their projects, and they believe that centralised tools +give them this control. However, if you're of this belief, and you +publish your CVS or Subversion repositories publicly, there are +plenty of tools available that can pull out your entire project's +history (albeit slowly) and recreate it somewhere that you don't +control. So while your control in this case is illusory, you are +forgoing the ability to fluidly collaborate with whatever people feel +compelled to mirror and fork your history.
+ +\subsection{Advantages for commercial projects} + +Many commercial projects are undertaken by teams that are scattered +across the globe. Contributors who are far from a central server will +see slower command execution and perhaps less reliability. Commercial +revision control systems attempt to ameliorate these problems with +remote-site replication add-ons that are typically expensive to buy +and cantankerous to administer. A distributed system doesn't suffer +from these problems in the first place. Better yet, you can easily +set up multiple authoritative servers, say one per site, so that +there's no redundant communication between repositories over expensive +long-haul network links. + +Centralised revision control systems tend to have relatively low +scalability. It's not unusual for an expensive centralised system to +fall over under the combined load of just a few dozen concurrent +users. Once again, the typical response tends to be an expensive and +clunky replication facility. Since the load on a central server---if +you have one at all---is many times lower with a distributed +tool (because all of the data is replicated everywhere), a single +cheap server can handle the needs of a much larger team, and +replication to balance load becomes a simple matter of scripting. + +If you have an employee in the field, troubleshooting a problem at a +customer's site, they'll benefit from distributed revision control. +The tool will let them generate custom builds, try different fixes in +isolation from each other, and search efficiently through history for +the sources of bugs and regressions in the customer's environment, all +without needing to connect to your company's network. + +\section{Why choose Mercurial?} + +Mercurial has a unique set of properties that make it a particularly +good choice as a revision control system. +\begin{itemize} +\item It is easy to learn and use. +\item It is lightweight. +\item It scales excellently. 
+\item It is easy to customise. +\end{itemize} + +If you are at all familiar with revision control systems, you should +be able to get up and running with Mercurial in less than five +minutes. Even if not, it will take no more than a few minutes +longer. Mercurial's command and feature sets are generally uniform +and consistent, so you can keep track of a few general rules instead +of a host of exceptions. + +On a small project, you can start working with Mercurial in moments. +Creating new changes and branches; transferring changes around +(whether locally or over a network); and history and status operations +are all fast. Mercurial attempts to stay nimble and largely out of +your way by combining low cognitive overhead with blazingly fast +operations. + +The usefulness of Mercurial is not limited to small projects: it is +used by projects with hundreds to thousands of contributors, each +containing tens of thousands of files and hundreds of megabytes of +source code. + +If the core functionality of Mercurial is not enough for you, it's +easy to build on. Mercurial is well suited to scripting tasks, and +its clean internals and implementation in Python make it easy to add +features in the form of extensions. There are a number of popular and +useful extensions already available, ranging from helping to identify +bugs to improving performance. + +\section{Mercurial compared with other tools} + +Before you read on, please understand that this section necessarily +reflects my own experiences, interests, and (dare I say it) biases. I +have used every one of the revision control tools listed below, in +most cases for several years at a time. + + +\subsection{Subversion} + +Subversion is a popular revision control tool, developed to replace +CVS. It has a centralised client/server architecture. + +Subversion and Mercurial have similarly named commands for performing +the same operations, so if you're familiar with one, it is easy to +learn to use the other. 
Both tools are portable to all popular +operating systems. + +Prior to version 1.5, Subversion had no useful support for merges. +At the time of writing, its merge tracking capability is new, and known to be +\href{http://svnbook.red-bean.com/nightly/en/svn.branchmerge.advanced.html#svn.branchmerge.advanced.finalword}{complicated + and buggy}. + +Mercurial has a substantial performance advantage over Subversion on +every revision control operation I have benchmarked. I have measured +its advantage as ranging from a factor of two to a factor of six when +compared with Subversion~1.4.3's \emph{ra\_local} file store, which is +the fastest access method available. In more realistic deployments +involving a network-based store, Subversion will be at a substantially +larger disadvantage. Because many Subversion commands must talk to +the server and Subversion does not have useful replication facilities, +server capacity and network bandwidth become bottlenecks for modestly +large projects. + +Additionally, Subversion incurs substantial storage overhead to avoid +network transactions for a few common operations, such as finding +modified files (\texttt{status}) and displaying modifications against +the current revision (\texttt{diff}). As a result, a Subversion +working copy is often the same size as, or larger than, a Mercurial +repository and working directory, even though the Mercurial repository +contains a complete history of the project. + +Subversion is widely supported by third party tools. Mercurial +currently lags considerably in this area. This gap is closing, +however, and indeed some of Mercurial's GUI tools now outshine their +Subversion equivalents. Like Mercurial, Subversion has an excellent +user manual. + +Because Subversion doesn't store revision history on the client, it is +well suited to managing projects that deal with lots of large, opaque +binary files. 
If you check in fifty revisions to an incompressible +10MB file, Subversion's client-side space usage stays constant. The +space used by any distributed SCM will grow rapidly in proportion to +the number of revisions, because the differences between each revision +are large. + +In addition, it's often difficult or, more usually, impossible to +merge different versions of a binary file. Subversion's ability to +let a user lock a file, so that they temporarily have the exclusive +right to commit changes to it, can be a significant advantage to a +project where binary files are widely used. + +Mercurial can import revision history from a Subversion repository. +It can also export revision history to a Subversion repository. This +makes it easy to ``test the waters'' and use Mercurial and Subversion +in parallel before deciding to switch. History conversion is +incremental, so you can perform an initial conversion, then small +additional conversions afterwards to bring in new changes. + + +\subsection{Git} + +Git is a distributed revision control tool that was developed for +managing the Linux kernel source tree. Like Mercurial, its early +design was somewhat influenced by Monotone. + +Git has a very large command set, with version~1.5.0 providing~139 +individual commands. It has something of a reputation for being +difficult to learn. Compared to Git, Mercurial has a strong focus on +simplicity. + +In terms of performance, Git is extremely fast. In several cases, it +is faster than Mercurial, at least on Linux, while Mercurial performs +better on other operations. However, on Windows, the performance and +general level of support that Git provides are, at the time of writing, +far behind those of Mercurial. + +While a Mercurial repository needs no maintenance, a Git repository +requires frequent manual ``repacks'' of its metadata. Without these, +performance degrades, while space usage grows rapidly.
A server that +contains many Git repositories that are not rigorously and frequently +repacked will become heavily disk-bound during backups, and there have +been instances of daily backups taking far longer than~24 hours as a +result. A freshly packed Git repository is slightly smaller than a +Mercurial repository, but an unpacked repository is several orders of +magnitude larger. + +The core of Git is written in C. Many Git commands are implemented as +shell or Perl scripts, and the quality of these scripts varies widely. +I have encountered several instances where scripts charged along +blindly in the presence of errors that should have been fatal. + +Mercurial can import revision history from a Git repository. + + +\subsection{CVS} + +CVS is probably the most widely used revision control tool in the +world. Due to its age and internal untidiness, it has been only +lightly maintained for many years. + +It has a centralised client/server architecture. It does not group +related file changes into atomic commits, making it easy for people to +``break the build'': one person can successfully commit part of a +change and then be blocked by the need for a merge, causing other +people to see only a portion of the work they intended to do. This +also affects how you work with project history. If you want to see +all of the modifications someone made as part of a task, you will need +to manually inspect the descriptions and timestamps of the changes +made to each file involved (if you even know what those files were). + +CVS has a muddled notion of tags and branches that I will not attempt +to even describe. It does not support renaming of files or +directories well, making it easy to corrupt a repository. It has +almost no internal consistency checking capabilities, so it is usually +not even possible to tell whether or how a repository is corrupt. I +would not recommend CVS for any project, existing or new. + +Mercurial can import CVS revision history. 
However, there are a few +caveats that apply; these are true of every other revision control +tool's CVS importer, too. Due to CVS's lack of atomic changes and +unversioned filesystem hierarchy, it is not possible to reconstruct +CVS history completely accurately; some guesswork is involved, and +renames will usually not show up. Because a lot of advanced CVS +administration has to be done by hand and is hence error-prone, it's +common for CVS importers to run into multiple problems with corrupted +repositories (completely bogus revision timestamps and files that have +remained locked for over a decade are just two of the less interesting +problems I can recall from personal experience). + +Mercurial can import revision history from a CVS repository. + + +\subsection{Commercial tools} + +Perforce has a centralised client/server architecture, with no +client-side caching of any data. Unlike modern revision control +tools, Perforce requires that a user run a command to inform the +server about every file they intend to edit. + +The performance of Perforce is quite good for small teams, but it +falls off rapidly as the number of users grows beyond a few dozen. +Modestly large Perforce installations require the deployment of +proxies to cope with the load their users generate. + + +\subsection{Choosing a revision control tool} + +With the exception of CVS, all of the tools listed above have unique +strengths that suit them to particular styles of work. There is no +single revision control tool that is best in all situations. + +As an example, Subversion is a good choice for working with frequently +edited binary files, due to its centralised nature and support for +file locking. + +I personally find Mercurial's properties of simplicity, performance, +and good merge support to be a compelling combination that has served +me well for several years. 
+ + +\section{Switching from another tool to Mercurial} + +Mercurial is bundled with an extension named \hgext{convert}, which +can incrementally import revision history from several other revision +control tools. By ``incremental'', I mean that you can convert all of +a project's history to date in one go, then rerun the conversion later +to obtain new changes that happened after the initial conversion. + +The revision control tools supported by \hgext{convert} are as +follows: +\begin{itemize} +\item Subversion +\item CVS +\item Git +\item Darcs +\end{itemize} + +In addition, \hgext{convert} can export changes from Mercurial to +Subversion. This makes it possible to try Subversion and Mercurial in +parallel before committing to a switchover, without risking the loss +of any work. + +The \hgxcmd{convert}{convert} command is easy to use. Simply point it +at the path or URL of the source repository, optionally give it the +name of the destination repository, and it will start working. After +the initial conversion, just run the same command again to import new +changes. + + +%%% Local Variables: +%%% mode: latex +%%% TeX-master: "00book" +%%% End: diff -r 3e78daaad99b -r b05e35d641e4 es/srcinstall.tex --- a/es/srcinstall.tex Fri Nov 07 21:33:22 2008 -0500 +++ b/es/srcinstall.tex Fri Nov 07 21:42:57 2008 -0500 @@ -0,0 +1,53 @@ +\chapter{Installing Mercurial from source} +\label{chap:srcinstall} + +\section{On a Unix-like system} +\label{sec:srcinstall:unixlike} + +If you are using a Unix-like system that has a sufficiently recent +version of Python (2.3~or newer) available, it is easy to install +Mercurial from source. +\begin{enumerate} +\item Download a recent source tarball from + \url{http://www.selenic.com/mercurial/download}. +\item Unpack the tarball: + \begin{codesample4} + gzip -dc mercurial-\emph{version}.tar.gz | tar xf - + \end{codesample4} +\item Go into the source directory and run the installer script.
This + will build Mercurial and install it in your home directory. + \begin{codesample4} + cd mercurial-\emph{version} + python setup.py install --force --home=\$HOME + \end{codesample4} +\end{enumerate} +Once the install finishes, Mercurial will be in the \texttt{bin} +subdirectory of your home directory. Don't forget to make sure that +this directory is present in your shell's search path. + +You will probably need to set the \envar{PYTHONPATH} environment +variable so that the Mercurial executable can find the rest of the +Mercurial packages. For example, on my laptop, I have set it to +\texttt{/home/bos/lib/python}. The exact path that you will need to +use depends on how Python was built for your system, but should be +easy to figure out. If you're uncertain, look through the output of +the installer script above, and see where the contents of the +\texttt{mercurial} directory were installed to. + +\section{On Windows} + +Building and installing Mercurial on Windows requires a variety of +tools, a fair amount of technical knowledge, and considerable +patience. I very much \emph{do not recommend} this route if you are a +``casual user''. Unless you intend to hack on Mercurial, I strongly +suggest that you use a binary package instead. + +If you are intent on building Mercurial from source on Windows, follow +the ``hard way'' directions on the Mercurial wiki at +\url{http://www.selenic.com/mercurial/wiki/index.cgi/WindowsInstall}, +and expect the process to involve a lot of fiddly work. + +%%% Local Variables: +%%% mode: latex +%%% TeX-master: "00book" +%%% End: diff -r 3e78daaad99b -r b05e35d641e4 es/template.tex --- a/es/template.tex Fri Nov 07 21:33:22 2008 -0500 +++ b/es/template.tex Fri Nov 07 21:42:57 2008 -0500 @@ -0,0 +1,475 @@ +\chapter{Customising the output of Mercurial} +\label{chap:template} + +Mercurial provides a powerful mechanism to let you control how it +displays information. The mechanism is based on templates. 
You can +use templates to generate specific output for a single command, or to +customise the entire appearance of the built-in web interface. + +\section{Using precanned output styles} +\label{sec:style} + +Packaged with Mercurial are some output styles that you can use +immediately. A style is simply a precanned template that someone +wrote and installed somewhere that Mercurial can find. + +Before we take a look at Mercurial's bundled styles, let's review its +normal output. + +\interaction{template.simple.normal} + +This is somewhat informative, but it takes up a lot of space---five +lines of output per changeset. The \texttt{compact} style reduces +this to three lines, presented in a sparse manner. + +\interaction{template.simple.compact} + +The \texttt{changelog} style hints at the expressive power of +Mercurial's templating engine. This style attempts to follow the GNU +Project's changelog guidelines\cite{web:changelog}. + +\interaction{template.simple.changelog} + +You will not be shocked to learn that Mercurial's default output style +is named \texttt{default}. + +\subsection{Setting a default style} + +You can modify the output style that Mercurial will use for every +command by editing your \hgrc\ file, naming the style you would +prefer to use. + +\begin{codesample2} + [ui] + style = compact +\end{codesample2} + +If you write a style of your own, you can use it by either providing +the path to your style file, or copying your style file into a +location where Mercurial can find it (typically the \texttt{templates} +subdirectory of your Mercurial install directory). + +\section{Commands that support styles and templates} + +All of Mercurial's ``\texttt{log}-like'' commands let you use styles +and templates: \hgcmd{incoming}, \hgcmd{log}, \hgcmd{outgoing}, and +\hgcmd{tip}. + +As I write this manual, these are so far the only commands that +support styles and templates. 
Since these are the most important +commands that need customisable output, there has been little pressure +from the Mercurial user community to add style and template support to +other commands. + +\section{The basics of templating} + +At its simplest, a Mercurial template is a piece of text. Some of the +text never changes, while other parts are \emph{expanded}, or replaced +with new text, when necessary. + +Before we continue, let's look again at a simple example of +Mercurial's normal output. + +\interaction{template.simple.normal} + +Now, let's run the same command, but using a template to change its +output. + +\interaction{template.simple.simplest} + +The example above illustrates the simplest possible template; it's +just a piece of static text, printed once for each changeset. The +\hgopt{log}{--template} option to the \hgcmd{log} command tells +Mercurial to use the given text as the template when printing each +changeset. + +Notice that the template string above ends with the text +``\Verb+\n+''. This is an \emph{escape sequence}, telling Mercurial +to print a newline at the end of each template item. If you omit this +newline, Mercurial will run each piece of output together. See +section~\ref{sec:template:escape} for more details of escape sequences. + +A template that prints a fixed string of text all the time isn't very +useful; let's try something a bit more complex. + +\interaction{template.simple.simplesub} + +As you can see, the string ``\Verb+{desc}+'' in the template has been +replaced in the output with the description of each changeset. Every +time Mercurial finds text enclosed in curly braces (``\texttt{\{}'' +and ``\texttt{\}}''), it will try to replace the braces and text with +the expansion of whatever is inside. To print a literal curly brace, +you must escape it, as described in section~\ref{sec:template:escape}. 
+ +\section{Common template keywords} +\label{sec:template:keyword} + +You can start writing simple templates immediately using the keywords +below. + +\begin{itemize} +\item[\tplkword{author}] String. The unmodified author of the changeset. +\item[\tplkword{branches}] String. The name of the branch on which + the changeset was committed. Will be empty if the branch name was + \texttt{default}. +\item[\tplkword{date}] Date information. The date when the changeset + was committed. This is \emph{not} human-readable; you must pass it + through a filter that will render it appropriately. See + section~\ref{sec:template:filter} for more information on filters. + The date is expressed as a pair of numbers. The first number is a + Unix UTC timestamp (seconds since January 1, 1970); the second is + the offset of the committer's timezone from UTC, in seconds. +\item[\tplkword{desc}] String. The text of the changeset description. +\item[\tplkword{files}] List of strings. All files modified, added, or + removed by this changeset. +\item[\tplkword{file\_adds}] List of strings. Files added by this + changeset. +\item[\tplkword{file\_dels}] List of strings. Files removed by this + changeset. +\item[\tplkword{node}] String. The changeset identification hash, as a + 40-character hexadecimal string. +\item[\tplkword{parents}] List of strings. The parents of the + changeset. +\item[\tplkword{rev}] Integer. The repository-local changeset revision + number. +\item[\tplkword{tags}] List of strings. Any tags associated with the + changeset. +\end{itemize} + +A few simple experiments will show us what to expect when we use these +keywords; you can see the results in +figure~\ref{fig:template:keywords}. + +\begin{figure} + \interaction{template.simple.keywords} + \caption{Template keywords in use} + \label{fig:template:keywords} +\end{figure} + +As we noted above, the date keyword does not produce human-readable +output, so we must treat it specially. 
This involves using a +\emph{filter}, about which more in section~\ref{sec:template:filter}. + +\interaction{template.simple.datekeyword} + +\section{Escape sequences} +\label{sec:template:escape} + +Mercurial's templating engine recognises the most commonly used escape +sequences in strings. When it sees a backslash (``\Verb+\+'') +character, it looks at the following character and substitutes the two +characters with a single replacement, as described below (ASCII +codes are given in octal). + +\begin{itemize} +\item[\Verb+\textbackslash\textbackslash+] Backslash, ``\Verb+\+'', + ASCII~134. +\item[\Verb+\textbackslash n+] Newline, ASCII~12. +\item[\Verb+\textbackslash r+] Carriage return, ASCII~15. +\item[\Verb+\textbackslash t+] Tab, ASCII~11. +\item[\Verb+\textbackslash v+] Vertical tab, ASCII~13. +\item[\Verb+\textbackslash \{+] Open curly brace, ``\Verb+{+'', ASCII~173. +\item[\Verb+\textbackslash \}+] Close curly brace, ``\Verb+}+'', ASCII~175. +\end{itemize} + +As indicated above, if you want the expansion of a template to contain +a literal ``\Verb+\+'', ``\Verb+{+'', or ``\Verb+}+'' character, you +must escape it. + +\section{Filtering keywords to change their results} +\label{sec:template:filter} + +Some of the results of template expansion are not immediately easy to +use. Mercurial lets you specify an optional chain of \emph{filters} +to modify the result of expanding a keyword. You have already seen a +common filter, \tplkwfilt{date}{isodate}, in action above, to make a +date readable. + +Below is a list of the most commonly used filters that Mercurial +supports. While some filters can be applied to any text, others can +only be used in specific circumstances. The name of each filter is +followed first by an indication of where it can be used, then a +description of its effect. + +\begin{itemize} +\item[\tplfilter{addbreaks}] Any text. Add an XHTML ``\Verb+<br/>+'' + tag before the end of every line except the last. For example, + ``\Verb+foo\nbar+'' becomes ``\Verb+foo<br/>\nbar+''. +\item[\tplkwfilt{date}{age}] \tplkword{date} keyword. Render the + age of the date, relative to the current time. Yields a string like + ``\Verb+10 minutes+''. +\item[\tplfilter{basename}] Any text, but most useful for the + \tplkword{files} keyword and its relatives. Treat the text as a + path, and return the basename. For example, ``\Verb+foo/bar/baz+'' + becomes ``\Verb+baz+''. +\item[\tplkwfilt{date}{date}] \tplkword{date} keyword. Render a date + in a similar format to the Unix \command{date} command, but with + timezone included. Yields a string like + ``\Verb+Mon Sep 04 15:13:13 2006 -0700+''. +\item[\tplkwfilt{author}{domain}] Any text, but most useful for the + \tplkword{author} keyword. Find the first string that looks like + an email address, and extract just the domain component. For + example, ``\Verb+Bryan O'Sullivan <bos@serpentine.com>+'' becomes + ``\Verb+serpentine.com+''. +\item[\tplkwfilt{author}{email}] Any text, but most useful for the + \tplkword{author} keyword. Extract the first string that looks like + an email address. For example, + ``\Verb+Bryan O'Sullivan <bos@serpentine.com>+'' becomes + ``\Verb+bos@serpentine.com+''. +\item[\tplfilter{escape}] Any text. Replace the special XML/XHTML + characters ``\Verb+&+'', ``\Verb+<+'' and ``\Verb+>+'' with + XML entities. +\item[\tplfilter{fill68}] Any text. Wrap the text to fit in 68 + columns. This is useful if you then pass the text through the + \tplfilter{tabindent} filter and still want it to fit in an + 80-column fixed-font window. +\item[\tplfilter{fill76}] Any text. Wrap the text to fit in 76 + columns. +\item[\tplfilter{firstline}] Any text. Yield the first line of text, + without any trailing newlines. +\item[\tplkwfilt{date}{hgdate}] \tplkword{date} keyword. Render the + date as a pair of readable numbers. Yields a string like + ``\Verb+1157407993 25200+''. +\item[\tplkwfilt{date}{isodate}] \tplkword{date} keyword. Render the + date as a text string in ISO~8601 format.
Yields a string like + ``\Verb+2006-09-04 15:13:13 -0700+''. +\item[\tplfilter{obfuscate}] Any text, but most useful for the + \tplkword{author} keyword. Yield the input text rendered as a + sequence of XML entities. This helps to defeat some particularly + stupid screen-scraping email harvesting spambots. +\item[\tplkwfilt{author}{person}] Any text, but most useful for the + \tplkword{author} keyword. Yield the text before an email address. + For example, ``\Verb+Bryan O'Sullivan <bos@serpentine.com>+'' + becomes ``\Verb+Bryan O'Sullivan+''. +\item[\tplkwfilt{date}{rfc822date}] \tplkword{date} keyword. Render a + date using the same format used in email headers. Yields a string + like ``\Verb+Mon, 04 Sep 2006 15:13:13 -0700+''. +\item[\tplkwfilt{node}{short}] Changeset hash. Yield the short form + of a changeset hash, i.e.~a 12-character hexadecimal string. +\item[\tplkwfilt{date}{shortdate}] \tplkword{date} keyword. Render + the year, month, and day of the date. Yields a string like + ``\Verb+2006-09-04+''. +\item[\tplfilter{strip}] Any text. Strip all leading and trailing + whitespace from the string. +\item[\tplfilter{tabindent}] Any text. Yield the text, with every line + except the first starting with a tab character. +\item[\tplfilter{urlescape}] Any text. Escape all characters that are + considered ``special'' by URL parsers. For example, \Verb+foo bar+ + becomes \Verb+foo%20bar+. +\item[\tplkwfilt{author}{user}] Any text, but most useful for the + \tplkword{author} keyword. Return the ``user'' portion of an email + address. For example, + ``\Verb+Bryan O'Sullivan <bos@serpentine.com>+'' becomes + ``\Verb+bos+''. +\end{itemize} + +\begin{figure} + \interaction{template.simple.manyfilters} + \caption{Template filters in action} + \label{fig:template:filters} +\end{figure} + +\begin{note} + If you try to apply a filter to a piece of data that it cannot + process, Mercurial will fail and print a Python exception.
For + example, trying to run the output of the \tplkword{desc} keyword + into the \tplkwfilt{date}{isodate} filter is not a good idea. +\end{note} + +\subsection{Combining filters} + +It is easy to combine filters to yield output in the form you would +like. The following chain of filters tidies up a description, then +makes sure that it fits cleanly into 68 columns, then indents it by a +further 8~characters (at least on Unix-like systems, where a tab is +conventionally 8~characters wide). + +\interaction{template.simple.combine} + +Note the use of ``\Verb+\t+'' (a tab character) in the template to +force the first line to be indented; this is necessary since +\tplfilter{tabindent} indents all lines \emph{except} the first. + +Keep in mind that the order of filters in a chain is significant. The +first filter is applied to the result of the keyword; the second to +the result of the first filter; and so on. For example, using +\Verb+fill68|tabindent+ gives very different results from +\Verb+tabindent|fill68+. + + +\section{From templates to styles} + +A command line template provides a quick and simple way to format some +output. Templates can become verbose, though, and it's useful to be +able to give a template a name. A style file is a template with a +name, stored in a file. + +More than that, using a style file unlocks the power of Mercurial's +templating engine in ways that are not possible using the command line +\hgopt{log}{--template} option. + +\subsection{The simplest of style files} + +Our simple style file contains just one line: + +\interaction{template.simple.rev} + +This tells Mercurial, ``if you're printing a changeset, use the text +on the right as the template''. + +\subsection{Style file syntax} + +The syntax rules for a style file are simple. + +\begin{itemize} +\item The file is processed one line at a time. + +\item Leading and trailing white space is ignored. + +\item Empty lines are skipped.
+ +\item If a line starts with either of the characters ``\texttt{\#}'' or + ``\texttt{;}'', the entire line is treated as a comment, and skipped + as if empty. + +\item A line starts with a keyword. This must start with an + alphabetic character or underscore, and can subsequently contain any + alphanumeric character or underscore. (In regexp notation, a + keyword must match \Verb+[A-Za-z_][A-Za-z0-9_]*+.) + +\item The next element must be an ``\texttt{=}'' character, which can + be preceded or followed by an arbitrary amount of white space. + +\item If the rest of the line starts and ends with matching quote + characters (either single or double quote), it is treated as a + template body. + +\item If the rest of the line \emph{does not} start with a quote + character, it is treated as the name of a file; the contents of this + file will be read and used as a template body. +\end{itemize} + +\section{Style files by example} + +To illustrate how to write a style file, we will construct a few by +example. Rather than provide a complete style file and walk through +it, we'll mirror the usual process of developing a style file by +starting with something very simple, and walking through a series of +successively more complete examples. + +\subsection{Identifying mistakes in style files} + +If Mercurial encounters a problem in a style file you are working on, +it prints a terse error message that, once you figure out what it +means, is actually quite useful. + +\interaction{template.svnstyle.syntax.input} + +Notice that \filename{broken.style} attempts to define a +\texttt{changeset} keyword, but forgets to give any content for it. +When instructed to use this style file, Mercurial promptly complains. + +\interaction{template.svnstyle.syntax.error} + +This error message looks intimidating, but it is not too hard to +follow. + +\begin{itemize} +\item The first component is simply Mercurial's way of saying ``I am + giving up''. 
+ \begin{codesample4} + \textbf{abort:} broken.style:1: parse error + \end{codesample4} + +\item Next comes the name of the style file that contains the error. + \begin{codesample4} + abort: \textbf{broken.style}:1: parse error + \end{codesample4} + +\item Following the file name is the line number where the error was + encountered. + \begin{codesample4} + abort: broken.style:\textbf{1}: parse error + \end{codesample4} + +\item Finally, a description of what went wrong. + \begin{codesample4} + abort: broken.style:1: \textbf{parse error} + \end{codesample4} + The description of the problem is not always clear (as in this + case), but even when it is cryptic, it is almost always trivial to + visually inspect the offending line in the style file and see what + is wrong. +\end{itemize} + +\subsection{Uniquely identifying a repository} + +If you would like to be able to identify a Mercurial repository +``fairly uniquely'' using a short string as an identifier, you can +use the first revision in the repository. +\interaction{template.svnstyle.id} +This is not guaranteed to be unique, but it is nevertheless useful in +many cases. +\begin{itemize} +\item It will not work in a completely empty repository, because such + a repository does not have a revision~zero. +\item Neither will it work in the (extremely rare) case where a + repository is a merge of two or more formerly independent + repositories, and you still have those repositories around. +\end{itemize} +Here are some uses to which you could put this identifier: +\begin{itemize} +\item As a key into a table for a database that manages repositories + on a server. +\item As half of a \{\emph{repository~ID}, \emph{revision~ID}\} tuple. + Save this information away when you run an automated build or other + activity, so that you can ``replay'' the build later if necessary. 
+\end{itemize} + +\subsection{Mimicking Subversion's output} + +Let's try to emulate the default output format used by another +revision control tool, Subversion. +\interaction{template.svnstyle.short} + +Since Subversion's output style is fairly simple, it is easy to +copy-and-paste a hunk of its output into a file, and replace the text +produced above by Subversion with the template values we'd like to see +expanded. +\interaction{template.svnstyle.template} + +There are a few small ways in which this template deviates from the +output produced by Subversion. +\begin{itemize} +\item Subversion prints a ``readable'' date (the ``\texttt{Wed, 27 Sep + 2006}'' in the example output above) in parentheses. Mercurial's + templating engine does not provide a way to display a date in this + format without also printing the time and time zone. +\item We emulate Subversion's printing of ``separator'' lines full of + ``\texttt{-}'' characters by ending the template with such a line. + We use the templating engine's \tplkword{header} keyword to print a + separator line as the first line of output (see below), thus + achieving similar output to Subversion. +\item Subversion's output includes a count in the header of the number + of lines in the commit message. We cannot replicate this in + Mercurial; the templating engine does not currently provide a filter + that counts the number of items it is passed. +\end{itemize} +It took me no more than a minute or two of work to replace literal +text from an example of Subversion's output with some keywords and +filters to give the template above. The style file simply refers to +the template. +\interaction{template.svnstyle.style} + +We could have included the text of the template file directly in the +style file by enclosing it in quotes and replacing the newlines with +``\verb!\n!'' sequences, but it would have made the style file too +difficult to read. 
Readability is a good guide when you're trying to +decide whether some text belongs in a style file, or in a template +file that the style file points to. If the style file will look too +big or cluttered if you insert a literal piece of text, drop it into a +template instead. + +%%% Local Variables: +%%% mode: latex +%%% TeX-master: "00book" +%%% End: