# HG changeset patch # User jerojasro@localhost # Date 1224362681 18000 # Node ID 7e52f0cc451618ddef234c425c1e95e6e69ccef9 # Parent 7f0af73f53ab94cf345875b6090478f7a0770c1f changed es/hgext.tex changed es/hook.tex changed es/kdiff3.png changed es/license.tex changed es/mq-collab.tex changed es/mq-ref.tex changed es/mq.tex changed es/note.png changed es/tour-merge.tex changed es/undo-manual-merge.dot changed es/undo-non-tip.dot files needed to compile the pdf version of the book. diff -r 7f0af73f53ab -r 7e52f0cc4516 en/examples/bisect.commits.out --- a/en/examples/bisect.commits.out Sat Oct 18 14:35:43 2008 -0500 +++ b/en/examples/bisect.commits.out Sat Oct 18 15:44:41 2008 -0500 @@ -1,16 +1,3 @@ - - - - - - - - - - - - - @@ -21,25 +8,3 @@ - - - - - - - - - - - - - - - - - - - - - - diff -r 7f0af73f53ab -r 7e52f0cc4516 en/examples/bisect.help.out --- a/en/examples/bisect.help.out Sat Oct 18 14:35:43 2008 -0500 +++ b/en/examples/bisect.help.out Sat Oct 18 15:44:41 2008 -0500 @@ -21,4 +21,3 @@ - diff -r 7f0af73f53ab -r 7e52f0cc4516 en/examples/bisect.init.out --- a/en/examples/bisect.init.out Sat Oct 18 14:35:43 2008 -0500 +++ b/en/examples/bisect.init.out Sat Oct 18 15:44:41 2008 -0500 @@ -1,3 +1,2 @@ - diff -r 7f0af73f53ab -r 7e52f0cc4516 en/examples/bisect.search.bad-init.out --- a/en/examples/bisect.search.bad-init.out Sat Oct 18 14:35:43 2008 -0500 +++ b/en/examples/bisect.search.bad-init.out Sat Oct 18 15:44:41 2008 -0500 @@ -1,2 +1,1 @@ - diff -r 7f0af73f53ab -r 7e52f0cc4516 en/examples/bisect.search.good-init.out --- a/en/examples/bisect.search.good-init.out Sat Oct 18 14:35:43 2008 -0500 +++ b/en/examples/bisect.search.good-init.out Sat Oct 18 15:44:41 2008 -0500 @@ -1,4 +1,3 @@ - diff -r 7f0af73f53ab -r 7e52f0cc4516 en/examples/bisect.search.init.out --- a/en/examples/bisect.search.init.out Sat Oct 18 14:35:43 2008 -0500 +++ b/en/examples/bisect.search.init.out Sat Oct 18 15:44:41 2008 -0500 @@ -1,25 +1,5 @@ - - - - - - - - - - - - - - - - - - - - diff -r 
7f0af73f53ab -r 7e52f0cc4516 en/examples/bisect.search.reset.out --- a/en/examples/bisect.search.reset.out Sat Oct 18 14:35:43 2008 -0500 +++ b/en/examples/bisect.search.reset.out Sat Oct 18 15:44:41 2008 -0500 @@ -1,2 +1,1 @@ - diff -r 7f0af73f53ab -r 7e52f0cc4516 en/examples/bisect.search.rest.out --- a/en/examples/bisect.search.rest.out Sat Oct 18 14:35:43 2008 -0500 +++ b/en/examples/bisect.search.rest.out Sat Oct 18 15:44:41 2008 -0500 @@ -14,6 +14,3 @@ - - - diff -r 7f0af73f53ab -r 7e52f0cc4516 en/examples/bisect.search.step1.out --- a/en/examples/bisect.search.step1.out Sat Oct 18 14:35:43 2008 -0500 +++ b/en/examples/bisect.search.step1.out Sat Oct 18 15:44:41 2008 -0500 @@ -9,4 +9,3 @@ - diff -r 7f0af73f53ab -r 7e52f0cc4516 en/examples/bisect.search.step2.out --- a/en/examples/bisect.search.step2.out Sat Oct 18 14:35:43 2008 -0500 +++ b/en/examples/bisect.search.step2.out Sat Oct 18 15:44:41 2008 -0500 @@ -2,4 +2,3 @@ - diff -r 7f0af73f53ab -r 7e52f0cc4516 en/examples/branch-repo.bugfix.out --- a/en/examples/branch-repo.bugfix.out Sat Oct 18 14:35:43 2008 -0500 +++ b/en/examples/branch-repo.bugfix.out Sat Oct 18 15:44:41 2008 -0500 @@ -5,7 +5,7 @@ $ \textbf{echo 'I fixed a bug using only echo!' >> myfile} $ \textbf{hg commit -m 'Important fix for 1.0.1'} $ \textbf{hg push} -pushing to /tmp/branch-repo4rF-PL/myproject-1.0.1 +pushing to searching for changes adding changesets adding manifests diff -r 7f0af73f53ab -r 7e52f0cc4516 en/examples/branching.stable.out --- a/en/examples/branching.stable.out Sat Oct 18 14:35:43 2008 -0500 +++ b/en/examples/branching.stable.out Sat Oct 18 15:44:41 2008 -0500 @@ -5,7 +5,7 @@ $ \textbf{echo 'This is a fix to a boring feature.' 
> myfile} $ \textbf{hg commit -m 'Fix a bug'} $ \textbf{hg push} -pushing to /tmp/branchingfJgZac/stable +pushing to searching for changes adding changesets adding manifests diff -r 7f0af73f53ab -r 7e52f0cc4516 es/hgext.tex --- a/es/hgext.tex Sat Oct 18 14:35:43 2008 -0500 +++ b/es/hgext.tex Sat Oct 18 15:44:41 2008 -0500 @@ -0,0 +1,429 @@ +\chapter{Adding functionality with extensions} +\label{chap:hgext} + +While the core of Mercurial is quite complete from a functionality +standpoint, it's deliberately shorn of fancy features. This approach +of preserving simplicity keeps the software easy to deal with for both +maintainers and users. + +However, Mercurial doesn't box you in with an inflexible command set: +you can add features to it as \emph{extensions} (sometimes known as +\emph{plugins}). We've already discussed a few of these extensions in +earlier chapters. +\begin{itemize} +\item Section~\ref{sec:tour-merge:fetch} covers the \hgext{fetch} + extension; this combines pulling new changes and merging them with + local changes into a single command, \hgxcmd{fetch}{fetch}. +\item In chapter~\ref{chap:hook}, we covered several extensions that + are useful for hook-related functionality: \hgext{acl} adds access + control lists; \hgext{bugzilla} adds integration with the Bugzilla + bug tracking system; and \hgext{notify} sends notification emails on + new changes. +\item The Mercurial Queues patch management extension is so invaluable + that it merits two chapters and an appendix all to itself. + Chapter~\ref{chap:mq} covers the basics; + chapter~\ref{chap:mq-collab} discusses advanced topics; and + appendix~\ref{chap:mqref} goes into detail on each command. +\end{itemize} + +In this chapter, we'll cover some of the other extensions that are +available for Mercurial, and briefly touch on some of the machinery +you'll need to know about if you want to write an extension of your +own. 
+\begin{itemize} +\item In section~\ref{sec:hgext:inotify}, we'll discuss the + possibility of \emph{huge} performance improvements using the + \hgext{inotify} extension. +\end{itemize} + +\section{Improve performance with the \hgext{inotify} extension} +\label{sec:hgext:inotify} + +Are you interested in having some of the most common Mercurial +operations run as much as a hundred times faster? Read on! + +Mercurial has great performance under normal circumstances. For +example, when you run the \hgcmd{status} command, Mercurial has to +scan almost every directory and file in your repository so that it can +display file status. Many other Mercurial commands need to do the +same work behind the scenes; for example, the \hgcmd{diff} command +uses the status machinery to avoid doing an expensive comparison +operation on files that obviously haven't changed. + +Because obtaining file status is crucial to good performance, the +authors of Mercurial have optimised this code to within an inch of its +life. However, there's no avoiding the fact that when you run +\hgcmd{status}, Mercurial is going to have to perform at least one +expensive system call for each managed file to determine whether it's +changed since the last time Mercurial checked. For a sufficiently +large repository, this can take a long time. + +To put a number on the magnitude of this effect, I created a +repository containing 150,000 managed files. I timed \hgcmd{status} +as taking ten seconds to run, even when \emph{none} of those files had +been modified. + +Many modern operating systems contain a file notification facility. +If a program signs up to an appropriate service, the operating system +will notify it every time a file of interest is created, modified, or +deleted. On Linux systems, the kernel component that does this is +called \texttt{inotify}. + +Mercurial's \hgext{inotify} extension talks to the kernel's +\texttt{inotify} component to optimise \hgcmd{status} commands. 
The +extension has two components. A daemon sits in the background and +receives notifications from the \texttt{inotify} subsystem. It also +listens for connections from a regular Mercurial command. The +extension modifies Mercurial's behaviour so that instead of scanning +the filesystem, it queries the daemon. Since the daemon has perfect +information about the state of the repository, it can respond with a +result instantaneously, avoiding the need to scan every directory and +file in the repository. + +Recall the ten seconds that I measured plain Mercurial as taking to +run \hgcmd{status} on a 150,000 file repository. With the +\hgext{inotify} extension enabled, the time dropped to 0.1~seconds, a +factor of \emph{one hundred} faster. + +Before we continue, please pay attention to some caveats. +\begin{itemize} +\item The \hgext{inotify} extension is Linux-specific. Because it + interfaces directly to the Linux kernel's \texttt{inotify} + subsystem, it does not work on other operating systems. +\item It should work on any Linux distribution that was released after + early~2005. Older distributions are likely to have a kernel that + lacks \texttt{inotify}, or a version of \texttt{glibc} that does not + have the necessary interfacing support. +\item Not all filesystems are suitable for use with the + \hgext{inotify} extension. Network filesystems such as NFS are a + non-starter, for example, particularly if you're running Mercurial + on several systems, all mounting the same network filesystem. The + kernel's \texttt{inotify} system has no way of knowing about changes + made on another system. Most local filesystems (e.g.~ext3, XFS, + ReiserFS) should work fine. +\end{itemize} + +The \hgext{inotify} extension is not yet shipped with Mercurial as of +May~2007, so it's a little more involved to set up than other +extensions. But the performance improvement is worth it! 
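The first two caveats above can be checked before going through the setup steps. This is a minimal sketch, not part of Mercurial or the \hgext{inotify} extension: the function name is hypothetical, and it assumes that a Linux kernel with \texttt{inotify} enabled exposes its tunables under \texttt{/proc/sys/fs/inotify}. It says nothing about whether the filesystem holding your repository is suitable (for example, it cannot detect NFS).

```python
import os
import sys


def inotify_probably_available(proc_dir="/proc/sys/fs/inotify"):
    """Guess whether this system has kernel inotify support.

    Assumption: inotify-enabled Linux kernels expose a directory of
    tunables at /proc/sys/fs/inotify. Hypothetical helper, not a
    Mercurial API.
    """
    # The extension is Linux-specific, so rule out other platforms first.
    if not sys.platform.startswith("linux"):
        return False
    # The directory is absent on kernels built without inotify.
    return os.path.isdir(proc_dir)
```

If this returns a false value, the kernel-side prerequisite is most likely missing and the remaining steps are unlikely to help.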
+ +The extension currently comes in two parts: a set of patches to the +Mercurial source code, and a library of Python bindings to the +\texttt{inotify} subsystem. +\begin{note} + There are \emph{two} Python \texttt{inotify} binding libraries. One + of them is called \texttt{pyinotify}, and is packaged by some Linux + distributions as \texttt{python-inotify}. This is \emph{not} the + one you'll need, as it is too buggy and inefficient to be practical. +\end{note} +To get going, it's best to already have a functioning copy of +Mercurial installed. +\begin{note} + If you follow the instructions below, you'll be \emph{replacing} and + overwriting any existing installation of Mercurial that you might + already have, using the latest ``bleeding edge'' Mercurial code. + Don't say you weren't warned! +\end{note} +\begin{enumerate} +\item Clone the Python \texttt{inotify} binding repository. Build and + install it. + \begin{codesample4} + hg clone http://hg.kublai.com/python/inotify + cd inotify + python setup.py build --force + sudo python setup.py install --skip-build + \end{codesample4} +\item Clone the \dirname{crew} Mercurial repository. Clone the + \hgext{inotify} patch repository so that Mercurial Queues will be + able to apply patches to your copy of the \dirname{crew} repository. + \begin{codesample4} + hg clone http://hg.intevation.org/mercurial/crew + hg clone crew inotify + hg clone http://hg.kublai.com/mercurial/patches/inotify inotify/.hg/patches + \end{codesample4} +\item Make sure that you have the Mercurial Queues extension, + \hgext{mq}, enabled. If you've never used MQ, read + section~\ref{sec:mq:start} to get started quickly. +\item Go into the \dirname{inotify} repo, and apply all of the + \hgext{inotify} patches using the \hgxopt{mq}{qpush}{-a} option to + the \hgxcmd{mq}{qpush} command. + \begin{codesample4} + cd inotify + hg qpush -a + \end{codesample4} + If you get an error message from \hgxcmd{mq}{qpush}, you should not + continue. 
Instead, ask for help. +\item Build and install the patched version of Mercurial. + \begin{codesample4} + python setup.py build --force + sudo python setup.py install --skip-build + \end{codesample4} +\end{enumerate} +Once you've built a suitably patched version of Mercurial, all you +need to do to enable the \hgext{inotify} extension is add an entry to +your \hgrc. +\begin{codesample2} + [extensions] + inotify = +\end{codesample2} +When the \hgext{inotify} extension is enabled, Mercurial will +automatically and transparently start the status daemon the first time +you run a command that needs status in a repository. It runs one +status daemon per repository. + +The status daemon is started silently, and runs in the background. If +you look at a list of running processes after you've enabled the +\hgext{inotify} extension and run a few commands in different +repositories, you'll thus see a few \texttt{hg} processes sitting +around, waiting for updates from the kernel and queries from +Mercurial. + +The first time you run a Mercurial command in a repository when you +have the \hgext{inotify} extension enabled, it will run with about the +same performance as a normal Mercurial command. This is because the +status daemon needs to perform a normal status scan so that it has a +baseline against which to apply later updates from the kernel. +However, \emph{every} subsequent command that does any kind of status +check should be noticeably faster on repositories of even fairly +modest size. Better yet, the bigger your repository is, the greater a +performance advantage you'll see. The \hgext{inotify} daemon makes +status operations almost instantaneous on repositories of all sizes! + +If you like, you can manually start a status daemon using the +\hgxcmd{inotify}{inserve} command. This gives you slightly finer +control over how the daemon ought to run. This command will of course +only be available when the \hgext{inotify} extension is enabled. 
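To see the difference between the first, baseline-building run and later runs for yourself, you can time the same command several times. The following is a rough sketch under stated assumptions: the helper names are hypothetical, it measures wall-clock time only, and it assumes \texttt{hg} is on your \texttt{PATH} when you use it as suggested in the comments.

```python
import subprocess
import time


def time_command(argv, cwd=None):
    """Run argv once, discard its output, and return elapsed seconds."""
    start = time.perf_counter()
    subprocess.run(
        argv,
        cwd=cwd,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        check=False,  # a non-zero exit status is not treated as an error here
    )
    return time.perf_counter() - start


def compare_runs(argv, cwd=None, runs=3):
    """Time the same command several times, in order, and return the timings."""
    return [time_command(argv, cwd) for _ in range(runs)]


# Example use inside a repository (assumes the hg executable is installed):
#     timings = compare_runs(["hg", "status"], cwd="/path/to/repo")
#     print(["%.2fs" % t for t in timings])
```

With the extension enabled, you would expect the first timing to be close to that of plain Mercurial and the later ones to be smaller; the actual numbers depend on repository size and on whether the operating system has the relevant metadata cached.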
+ +When you're using the \hgext{inotify} extension, you should notice +\emph{no difference at all} in Mercurial's behaviour, with the sole +exception of status-related commands running a whole lot faster than +they used to. You should specifically expect that commands will not +print different output; neither should they give different results. +If either of these situations occurs, please report a bug. + +\section{Flexible diff support with the \hgext{extdiff} extension} +\label{sec:hgext:extdiff} + +Mercurial's built-in \hgcmd{diff} command outputs plaintext unified +diffs. +\interaction{extdiff.diff} +If you would like to use an external tool to display modifications, +you'll want to use the \hgext{extdiff} extension. This will let you +use, for example, a graphical diff tool. + +The \hgext{extdiff} extension is bundled with Mercurial, so it's easy +to set up. In the \rcsection{extensions} section of your \hgrc, +simply add a one-line entry to enable the extension. +\begin{codesample2} + [extensions] + extdiff = +\end{codesample2} +This introduces a command named \hgxcmd{extdiff}{extdiff}, which by +default uses your system's \command{diff} command to generate a +unified diff in the same form as the built-in \hgcmd{diff} command. +\interaction{extdiff.extdiff} +The result won't be exactly the same as with the built-in \hgcmd{diff} +variations, because the output of \command{diff} varies from one +system to another, even when passed the same options. + +As the ``\texttt{making snapshot}'' lines of output above imply, the +\hgxcmd{extdiff}{extdiff} command works by creating two snapshots of +your source tree. The first snapshot is of the source revision; the +second, of the target revision or working directory. The +\hgxcmd{extdiff}{extdiff} command generates these snapshots in a +temporary directory, passes the name of each directory to an external +diff viewer, then deletes the temporary directory. 
For efficiency, it +only snapshots the directories and files that have changed between the +two revisions. + +Snapshot directory names have the same base name as your repository. +If your repository path is \dirname{/quux/bar/foo}, then \dirname{foo} +will be the name of each snapshot directory. Each snapshot directory +name has its changeset ID appended, if appropriate. If a snapshot is +of revision \texttt{a631aca1083f}, the directory will be named +\dirname{foo.a631aca1083f}. A snapshot of the working directory won't +have a changeset ID appended, so it would just be \dirname{foo} in +this example. To see what this looks like in practice, look again at +the \hgxcmd{extdiff}{extdiff} example above. Notice that the diff has +the snapshot directory names embedded in its header. + +The \hgxcmd{extdiff}{extdiff} command accepts two important options. +The \hgxopt{extdiff}{extdiff}{-p} option lets you choose a program to +view differences with, instead of \command{diff}. With the +\hgxopt{extdiff}{extdiff}{-o} option, you can change the options that +\hgxcmd{extdiff}{extdiff} passes to the program (by default, these +options are ``\texttt{-Npru}'', which only make sense if you're +running \command{diff}). In other respects, the +\hgxcmd{extdiff}{extdiff} command acts similarly to the built-in +\hgcmd{diff} command: you use the same option names, syntax, and +arguments to specify the revisions you want, the files you want, and +so on. + +As an example, here's how to run the normal system \command{diff} +command, getting it to generate context diffs (using the +\cmdopt{diff}{-c} option) instead of unified diffs, and five lines of +context instead of the default three (passing \texttt{5} as the +argument to the \cmdopt{diff}{-C} option). +\interaction{extdiff.extdiff-ctx} + +Launching a visual diff tool is just as easy. Here's how to launch +the \command{kdiff3} viewer. 
+\begin{codesample2} + hg extdiff -p kdiff3 -o '' +\end{codesample2} + +If your diff viewing command can't deal with directories, you can +easily work around this with a little scripting. For an example of +such scripting in action with the \hgext{mq} extension and the +\command{interdiff} command, see +section~\ref{mq-collab:tips:interdiff}. + +\subsection{Defining command aliases} + +It can be cumbersome to remember the options to both the +\hgxcmd{extdiff}{extdiff} command and the diff viewer you want to use, +so the \hgext{extdiff} extension lets you define \emph{new} commands +that will invoke your diff viewer with exactly the right options. + +All you need to do is edit your \hgrc, and add a section named +\rcsection{extdiff}. Inside this section, you can define multiple +commands. Here's how to add a \texttt{kdiff3} command. Once you've +defined this, you can type ``\texttt{hg kdiff3}'' and the +\hgext{extdiff} extension will run \command{kdiff3} for you. +\begin{codesample2} + [extdiff] + cmd.kdiff3 = +\end{codesample2} +If you leave the right hand side of the definition empty, as above, +the \hgext{extdiff} extension uses the name of the command you defined +as the name of the external program to run. But these names don't +have to be the same. Here, we define a command named ``\texttt{hg + wibble}'', which runs \command{kdiff3}. +\begin{codesample2} + [extdiff] + cmd.wibble = kdiff3 +\end{codesample2} + +You can also specify the default options that you want to invoke your +diff viewing program with. The prefix to use is ``\texttt{opts.}'', +followed by the name of the command to which the options apply. This +example defines a ``\texttt{hg vimdiff}'' command that runs the +\command{vim} editor's \texttt{DirDiff} extension. 
+\begin{codesample2} + [extdiff] + cmd.vimdiff = vim + opts.vimdiff = -f '+next' '+execute "DirDiff" argv(0) argv(1)' +\end{codesample2} + +\section{Cherrypicking changes with the \hgext{transplant} extension} +\label{sec:hgext:transplant} + +Need to have a long chat with Brendan about this. + +\section{Send changes via email with the \hgext{patchbomb} extension} +\label{sec:hgext:patchbomb} + +Many projects have a culture of ``change review'', in which people +send their modifications to a mailing list for others to read and +comment on before they commit the final version to a shared +repository. Some projects have people who act as gatekeepers; they +apply changes from other people to a repository to which those others +don't have access. + +Mercurial makes it easy to send changes over email for review or +application, via its \hgext{patchbomb} extension. The extension is so +named because changes are formatted as patches, and it's usual to send +one changeset per email message. Sending a long series of changes by +email is thus much like ``bombing'' the recipient's inbox, hence +``patchbomb''. + +As usual, the basic configuration of the \hgext{patchbomb} extension +takes just one or two lines in your \hgrc. +\begin{codesample2} + [extensions] + patchbomb = +\end{codesample2} +Once you've enabled the extension, you will have a new command +available, named \hgxcmd{patchbomb}{email}. + +The safest and best way to invoke the \hgxcmd{patchbomb}{email} +command is to \emph{always} run it first with the +\hgxopt{patchbomb}{email}{-n} option. This will show you what the +command \emph{would} send, without actually sending anything. Once +you've had a quick glance over the changes and verified that you are +sending the right ones, you can rerun the same command, with the +\hgxopt{patchbomb}{email}{-n} option removed. + +The \hgxcmd{patchbomb}{email} command accepts the same kind of +revision syntax as every other Mercurial command. 
For example, this +command will send every revision between 7 and \texttt{tip}, +inclusive. +\begin{codesample2} + hg email -n 7:tip +\end{codesample2} +You can also specify a \emph{repository} to compare with. If you +provide a repository but no revisions, the \hgxcmd{patchbomb}{email} +command will send all revisions in the local repository that are not +present in the remote repository. If you additionally specify +revisions or a branch name (the latter using the +\hgxopt{patchbomb}{email}{-b} option), this will constrain the +revisions sent. + +It's perfectly safe to run the \hgxcmd{patchbomb}{email} command +without the names of the people you want to send to: if you do this, +it will just prompt you for those values interactively. (If you're +using a Linux or Unix-like system, you should have enhanced +\texttt{readline}-style editing capabilities when entering those +headers, too, which is useful.) + +When you are sending just one revision, the \hgxcmd{patchbomb}{email} +command will by default use the first line of the changeset +description as the subject of the single email message it sends. + +If you send multiple revisions, the \hgxcmd{patchbomb}{email} command +will usually send one message per changeset. It will preface the +series with an introductory message, in which you should describe the +purpose of the series of changes you're sending. + +\subsection{Changing the behaviour of patchbombs} + +Not every project has exactly the same conventions for sending changes +in email; the \hgext{patchbomb} extension tries to accommodate a +number of variations through command line options. +\begin{itemize} +\item You can write a subject for the introductory message on the + command line using the \hgxopt{patchbomb}{email}{-s} option. This + takes one argument, the text of the subject to use. +\item To change the email address from which the messages originate, + use the \hgxopt{patchbomb}{email}{-f} option. This takes one + argument, the email address to use. 
+\item The default behaviour is to send unified diffs (see + section~\ref{sec:mq:patch} for a description of the format), one per + message. You can send a binary bundle instead with the + \hgxopt{patchbomb}{email}{-b} option. +\item Unified diffs are normally prefaced with a metadata header. You + can omit this, and send unadorned diffs, with the + \hgxopt{patchbomb}{email}{--plain} option. +\item Diffs are normally sent ``inline'', in the same body part as the + description of a patch. This makes it easiest for the largest + number of readers to quote and respond to parts of a diff, as some + mail clients will only quote the first MIME body part in a message. + If you'd prefer to send the description and the diff in separate + body parts, use the \hgxopt{patchbomb}{email}{-a} option. +\item Instead of sending mail messages, you can write them to an + \texttt{mbox}-format mail folder using the + \hgxopt{patchbomb}{email}{-m} option. That option takes one + argument, the name of the file to write to. +\item If you would like to add a \command{diffstat}-format summary to + each patch, and one to the introductory message, use the + \hgxopt{patchbomb}{email}{-d} option. The \command{diffstat} + command displays a table containing the name of each file patched, + the number of lines affected, and a histogram showing how much each + file is modified. This gives readers a qualitative glance at how + complex a patch is. +\end{itemize} + +%%% Local Variables: +%%% mode: latex +%%% TeX-master: "00book" +%%% End: diff -r 7f0af73f53ab -r 7e52f0cc4516 es/hook.tex --- a/es/hook.tex Sat Oct 18 14:35:43 2008 -0500 +++ b/es/hook.tex Sat Oct 18 15:44:41 2008 -0500 @@ -0,0 +1,1413 @@ +\chapter{Handling repository events with hooks} +\label{chap:hook} + +Mercurial offers a powerful mechanism to let you perform automated +actions in response to events that occur in a repository. In some +cases, you can even control Mercurial's response to those events. 
+ +The name Mercurial uses for one of these actions is a \emph{hook}. +Hooks are called ``triggers'' in some revision control systems, but +the two names refer to the same idea. + +\section{An overview of hooks in Mercurial} + +Here is a brief list of the hooks that Mercurial supports. We will +revisit each of these hooks in more detail later, in +section~\ref{sec:hook:ref}. + +\begin{itemize} +\item[\small\hook{changegroup}] This is run after a group of + changesets has been brought into the repository from elsewhere. +\item[\small\hook{commit}] This is run after a new changeset has been + created in the local repository. +\item[\small\hook{incoming}] This is run once for each new changeset + that is brought into the repository from elsewhere. Notice the + difference from \hook{changegroup}, which is run once per + \emph{group} of changesets brought in. +\item[\small\hook{outgoing}] This is run after a group of changesets + has been transmitted from this repository. +\item[\small\hook{prechangegroup}] This is run before starting to + bring a group of changesets into the repository. +\item[\small\hook{precommit}] Controlling. This is run before starting + a commit. +\item[\small\hook{preoutgoing}] Controlling. This is run before + starting to transmit a group of changesets from this repository. +\item[\small\hook{pretag}] Controlling. This is run before creating a tag. +\item[\small\hook{pretxnchangegroup}] Controlling. This is run after a + group of changesets has been brought into the local repository from + another, but before the transaction completes that will make the + changes permanent in the repository. +\item[\small\hook{pretxncommit}] Controlling. This is run after a new + changeset has been created in the local repository, but before the + transaction completes that will make it permanent. +\item[\small\hook{preupdate}] Controlling. This is run before starting + an update or merge of the working directory. 
+\item[\small\hook{tag}] This is run after a tag is created. +\item[\small\hook{update}] This is run after an update or merge of the + working directory has finished. +\end{itemize} +Each of the hooks whose description begins with the word +``Controlling'' has the ability to determine whether an activity can +proceed. If the hook succeeds, the activity may proceed; if it fails, +the activity is either not permitted or undone, depending on the hook. + +\section{Hooks and security} + +\subsection{Hooks are run with your privileges} + +When you run a Mercurial command in a repository, and the command +causes a hook to run, that hook runs on \emph{your} system, under +\emph{your} user account, with \emph{your} privilege level. Since +hooks are arbitrary pieces of executable code, you should treat them +with an appropriate level of suspicion. Do not install a hook unless +you are confident that you know who created it and what it does. + +In some cases, you may be exposed to hooks that you did not install +yourself. If you work with Mercurial on an unfamiliar system, +Mercurial will run hooks defined in that system's global \hgrc\ file. + +If you are working with a repository owned by another user, Mercurial +can run hooks defined in that user's repository, but it will still run +them as ``you''. For example, if you \hgcmd{pull} from that +repository, and its \sfilename{.hg/hgrc} defines a local +\hook{outgoing} hook, that hook will run under your user account, even +though you don't own that repository. + +\begin{note} + This only applies if you are pulling from a repository on a local or + network filesystem. If you're pulling over http or ssh, any + \hook{outgoing} hook will run under whatever account is executing + the server process, on the server. +\end{note} + +XXX To see what hooks are defined in a repository, use the +\hgcmdargs{config}{hooks} command. 
If you are working in one +repository, but talking to another that you do not own (e.g.~using +\hgcmd{pull} or \hgcmd{incoming}), remember that it is the other +repository's hooks you should be checking, not your own. + +\subsection{Hooks do not propagate} + +In Mercurial, hooks are not revision controlled, and do not propagate +when you clone, or pull from, a repository. The reason for this is +simple: a hook is a completely arbitrary piece of executable code. It +runs under your user identity, with your privilege level, on your +machine. + +It would be extremely reckless for any distributed revision control +system to implement revision-controlled hooks, as this would offer an +easily exploitable way to subvert the accounts of users of the +revision control system. + +Since Mercurial does not propagate hooks, if you are collaborating +with other people on a common project, you should not assume that they +are using the same Mercurial hooks as you are, or that theirs are +correctly configured. You should document the hooks you expect people +to use. + +In a corporate intranet, this is somewhat easier to control, as you +can for example provide a ``standard'' installation of Mercurial on an +NFS filesystem, and use a site-wide \hgrc\ file to define hooks that +all users will see. However, this too has its limits; see below. + +\subsection{Hooks can be overridden} + +Mercurial allows you to override a hook definition by redefining the +hook. You can disable it by setting its value to the empty string, or +change its behaviour as you wish. + +If you deploy a system-~or site-wide \hgrc\ file that defines some +hooks, you should thus understand that your users can disable or +override those hooks. + +\subsection{Ensuring that critical hooks are run} + +Sometimes you may want to enforce a policy that you do not want others +to be able to work around. For example, you may have a requirement +that every changeset must pass a rigorous set of tests. 
Defining this +requirement via a hook in a site-wide \hgrc\ won't work for remote +users on laptops, and of course local users can subvert it at will by +overriding the hook. + +Instead, you can set up your policies for use of Mercurial so that +people are expected to propagate changes through a well-known +``canonical'' server that you have locked down and configured +appropriately. + +One way to do this is via a combination of social engineering and +technology. Set up a restricted-access account; users can push +changes over the network to repositories managed by this account, but +they cannot log into the account and run normal shell commands. In +this scenario, a user can commit a changeset that contains any old +garbage they want. + +When someone pushes a changeset to the server that everyone pulls +from, the server will test the changeset before it accepts it as +permanent, and reject it if it fails to pass the test suite. If +people only pull changes from this filtering server, it will serve to +ensure that all changes that people pull have been automatically +vetted. + +\section{Care with \texttt{pretxn} hooks in a shared-access repository} + +If you want to use hooks to do some automated work in a repository +that a number of people have shared access to, you need to be careful +in how you do this. + +Mercurial only locks a repository when it is writing to the +repository, and only the parts of Mercurial that write to the +repository pay attention to locks. Write locks are necessary to +prevent multiple simultaneous writers from scribbling on each other's +work, corrupting the repository. + +Because Mercurial is careful with the order in which it reads and +writes data, it does not need to acquire a lock when it wants to read +data from the repository. The parts of Mercurial that read from the +repository never pay attention to locks. This lockless reading scheme +greatly increases performance and concurrency. 
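The idea that careful ordering can stand in for read locks can be illustrated with a toy model. This is only a sketch and not Mercurial's storage code: the class, file names, and index format are all invented for the illustration. The writer appends data first and the index entry that points to it last; the reader consults the index first and touches only the byte ranges named there, so it never follows a pointer to data that has not been written yet.

```python
import os


class ToyStore:
    """Toy append-only store: a data file plus an index of (offset, length).

    Hypothetical illustration only; Mercurial's real on-disk formats
    are more involved.
    """

    def __init__(self, directory):
        self.data_path = os.path.join(directory, "data")
        self.index_path = os.path.join(directory, "index")
        # Make sure both files exist so a reader can open them.
        for path in (self.data_path, self.index_path):
            open(path, "ab").close()

    def append(self, payload):
        """Writer: append the data first, then the index entry pointing at it."""
        offset = os.path.getsize(self.data_path)
        with open(self.data_path, "ab") as data:
            data.write(payload)
        with open(self.index_path, "ab") as index:
            index.write(b"%d %d\n" % (offset, len(payload)))

    def read_all(self):
        """Reader: read the index first, then only the ranges it names."""
        with open(self.index_path, "rb") as index:
            entries = [tuple(int(n) for n in line.split()) for line in index]
        records = []
        with open(self.data_path, "rb") as data:
            for offset, length in entries:
                data.seek(offset)
                records.append(data.read(length))
        return records
```

In this model, bytes that have been appended to the data file but not yet recorded in the index are invisible to `read_all`, which is the property that lets a reader proceed without taking a lock.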
+ +With great performance comes a trade-off, though, one which has the +potential to cause you trouble unless you're aware of it. To describe +this requires a little detail about how Mercurial adds changesets to a +repository and reads those changes. + +When Mercurial \emph{writes} metadata, it writes it straight into the +destination file. It writes file data first, then manifest data +(which contains pointers to the new file data), then changelog data +(which contains pointers to the new manifest data). Before the first +write to each file, it stores a record of where the end of the file +was in its transaction log. If the transaction must be rolled back, +Mercurial simply truncates each file back to the size it was before the +transaction began. + +When Mercurial \emph{reads} metadata, it reads the changelog first, +then everything else. Since a reader will only access parts of the +manifest or file metadata that it can see in the changelog, it can +never see partially written data. + +Some controlling hooks (\hook{pretxncommit} and +\hook{pretxnchangegroup}) run when a transaction is almost complete. +All of the metadata has been written, but Mercurial can still roll the +transaction back and cause the newly-written data to disappear. + +If one of these hooks runs for long, it opens a window of time during +which a reader can see the metadata for changesets that are not yet +permanent, and should not be thought of as ``really there''. The +longer the hook runs, the longer that window is open. + +\subsection{The problem illustrated} + +In principle, a good use for the \hook{pretxnchangegroup} hook would +be to automatically build and test incoming changes before they are +accepted into a central repository. This could let you guarantee that +nobody can push changes to this repository that ``break the build''. 
+But if a client can pull changes while they're being tested, the +usefulness of the test is zero; an unsuspecting someone can pull +untested changes, potentially breaking their build. + +The safest technological answer to this challenge is to set up such a +``gatekeeper'' repository as \emph{unidirectional}. Let it take +changes pushed in from the outside, but do not allow anyone to pull +changes from it (use the \hook{preoutgoing} hook to lock it down). +Configure a \hook{changegroup} hook so that if a build or test +succeeds, the hook will push the new changes out to another repository +that people \emph{can} pull from. + +In practice, putting a centralised bottleneck like this in place is +not often a good idea, and transaction visibility has nothing to do +with the problem. As the size of a project---and the time it takes to +build and test---grows, you rapidly run into a wall with this ``try +before you buy'' approach, where you have more changesets to test than +time in which to deal with them. The inevitable result is frustration +on the part of all involved. + +An approach that scales better is to get people to build and test +before they push, then run automated builds and tests centrally +\emph{after} a push, to be sure all is well. The advantage of this +approach is that it does not impose a limit on the rate at which the +repository can accept changes. + +\section{A short tutorial on using hooks} +\label{sec:hook:simple} + +It is easy to write a Mercurial hook. Let's start with a hook that +runs when you finish a \hgcmd{commit}, and simply prints the hash of +the changeset you just created. The hook is called \hook{commit}. + +\begin{figure}[ht] + \interaction{hook.simple.init} + \caption{A simple hook that runs when a changeset is committed} + \label{ex:hook:init} +\end{figure} + +All hooks follow the pattern in example~\ref{ex:hook:init}. You add +an entry to the \rcsection{hooks} section of your \hgrc. 
On the left +is the name of the event to trigger on; on the right is the action to +take. As you can see, you can run an arbitrary shell command in a +hook. Mercurial passes extra information to the hook using +environment variables (look for \envar{HG\_NODE} in the example). + +\subsection{Performing multiple actions per event} + +Quite often, you will want to define more than one hook for a +particular kind of event, as shown in example~\ref{ex:hook:ext}. +Mercurial lets you do this by adding an \emph{extension} to the end of +a hook's name. You extend a hook's name by giving the name of the +hook, followed by a full stop (the ``\texttt{.}'' character), followed +by some more text of your choosing. For example, Mercurial will run +both \texttt{commit.foo} and \texttt{commit.bar} when the +\texttt{commit} event occurs. + +\begin{figure}[ht] + \interaction{hook.simple.ext} + \caption{Defining a second \hook{commit} hook} + \label{ex:hook:ext} +\end{figure} + +To give a well-defined order of execution when there are multiple +hooks defined for an event, Mercurial sorts hooks by extension, and +executes the hook commands in this sorted order. In the above +example, it will execute \texttt{commit.bar} before +\texttt{commit.foo}, and \texttt{commit} before both. + +It is a good idea to use a somewhat descriptive extension when you +define a new hook. This will help you to remember what the hook was +for. If the hook fails, you'll get an error message that contains the +hook name and extension, so using a descriptive extension could give +you an immediate hint as to why the hook failed (see +section~\ref{sec:hook:perm} for an example). + +\subsection{Controlling whether an activity can proceed} +\label{sec:hook:perm} + +In our earlier examples, we used the \hook{commit} hook, which is +run after a commit has completed. This is one of several Mercurial +hooks that run after an activity finishes. Such hooks have no way of +influencing the activity itself. 
+ +Mercurial defines a number of events that occur before an activity +starts; or after it starts, but before it finishes. Hooks that +trigger on these events have the added ability to choose whether the +activity can continue, or will abort. + +The \hook{pretxncommit} hook runs after a commit has all but +completed. In other words, the metadata representing the changeset +has been written out to disk, but the transaction has not yet been +allowed to complete. The \hook{pretxncommit} hook has the ability to +decide whether the transaction can complete, or must be rolled back. + +If the \hook{pretxncommit} hook exits with a status code of zero, the +transaction is allowed to complete; the commit finishes; and the +\hook{commit} hook is run. If the \hook{pretxncommit} hook exits with +a non-zero status code, the transaction is rolled back; the metadata +representing the changeset is erased; and the \hook{commit} hook is +not run. + +\begin{figure}[ht] + \interaction{hook.simple.pretxncommit} + \caption{Using the \hook{pretxncommit} hook to control commits} + \label{ex:hook:pretxncommit} +\end{figure} + +The hook in example~\ref{ex:hook:pretxncommit} checks that a commit +comment contains a bug ID. If it does, the commit can complete. If +not, the commit is rolled back. + +\section{Writing your own hooks} + +When you are writing a hook, you might find it useful to run Mercurial +either with the \hggopt{-v} option, or the \rcitem{ui}{verbose} config +item set to ``true''. When you do so, Mercurial will print a message +before it calls each hook. + +\subsection{Choosing how your hook should run} +\label{sec:hook:lang} + +You can write a hook either as a normal program---typically a shell +script---or as a Python function that is executed within the Mercurial +process. + +Writing a hook as an external program has the advantage that it +requires no knowledge of Mercurial's internals. You can call normal +Mercurial commands to get any added information you need. 
The +trade-off is that external hooks are slower than in-process hooks. + +An in-process Python hook has complete access to the Mercurial API, +and does not ``shell out'' to another process, so it is inherently +faster than an external hook. It is also easier to obtain much of the +information that a hook requires by using the Mercurial API than by +running Mercurial commands. + +If you are comfortable with Python, or require high performance, +writing your hooks in Python may be a good choice. However, when you +have a straightforward hook to write and you don't need to care about +performance (probably the majority of hooks), a shell script is +perfectly fine. + +\subsection{Hook parameters} +\label{sec:hook:param} + +Mercurial calls each hook with a set of well-defined parameters. In +Python, a parameter is passed as a keyword argument to your hook +function. For an external program, a parameter is passed as an +environment variable. + +Whether your hook is written in Python or as a shell script, the +hook-specific parameter names and values will be the same. A boolean +parameter will be represented as a boolean value in Python, but as the +number 1 (for ``true'') or 0 (for ``false'') as an environment +variable for an external hook. If a hook parameter is named +\texttt{foo}, the keyword argument for a Python hook will also be +named \texttt{foo}, while the environment variable for an external +hook will be named \texttt{HG\_FOO}. + +\subsection{Hook return values and activity control} + +A hook that executes successfully must exit with a status of zero if +external, or return boolean ``false'' if in-process. Failure is +indicated with a non-zero exit status from an external hook, or an +in-process hook returning boolean ``true''. If an in-process hook +raises an exception, the hook is considered to have failed. + +For a hook that controls whether an activity can proceed, zero/false +means ``allow'', while non-zero/true/exception means ``deny''. 
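As a minimal sketch of the return-value convention described above, the following in-process hook denies a commit whose message is too short. The names \texttt{check\_message\_length}, \texttt{message\_too\_short} and \texttt{MIN\_LENGTH} are hypothetical. The \texttt{repo[node].description()} lookup and the use of byte strings assume a reasonably recent Mercurial, so treat this as an illustration rather than a definitive implementation.

```python
MIN_LENGTH = 10  # hypothetical threshold, in bytes


def message_too_short(message, minimum=MIN_LENGTH):
    """Return True if the stripped commit message has fewer than `minimum` bytes."""
    return len(message.strip()) < minimum


def check_message_length(ui, repo, node=None, **kwargs):
    """Controlling hook: return True to deny the activity, False to allow it.

    Assumption: `repo[node]` yields a changeset context object with a
    `description()` method, as in recent Mercurial releases.
    """
    message = repo[node].description()
    if message_too_short(message):
        ui.warn(b"commit message is too short\n")
        return True   # true means failure, so the transaction is rolled back
    return False      # false means success, so the commit may complete
```

Note that the hook names only the \texttt{node} parameter it needs and lets \texttt{**kwargs} absorb the rest. An external hook would express the same decision through its exit status instead.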
+ +\subsection{Writing an external hook} + +When you define an external hook in your \hgrc\ and the hook is run, +its value is passed to your shell, which interprets it. This means +that you can use normal shell constructs in the body of the hook. + +An executable hook is always run with its current directory set to a +repository's root directory. + +Each hook parameter is passed in as an environment variable; the name +is upper-cased, and prefixed with the string ``\texttt{HG\_}''. + +With the exception of hook parameters, Mercurial does not set or +modify any environment variables when running a hook. This is useful +to remember if you are writing a site-wide hook that may be run by a +number of different users with differing environment variables set. +In multi-user situations, you should not rely on environment variables +being set to the values you have in your environment when testing the +hook. + +\subsection{Telling Mercurial to use an in-process hook} + +The \hgrc\ syntax for defining an in-process hook is slightly +different than for an executable hook. The value of the hook must +start with the text ``\texttt{python:}'', and continue with the +fully-qualified name of a callable object to use as the hook's value. + +The module in which a hook lives is automatically imported when a hook +is run. So long as you have the module name and \envar{PYTHONPATH} +right, it should ``just work''. + +The following \hgrc\ example snippet illustrates the syntax and +meaning of the notions we just described. +\begin{codesample2} + [hooks] + commit.example = python:mymodule.submodule.myhook +\end{codesample2} +When Mercurial runs the \texttt{commit.example} hook, it imports +\texttt{mymodule.submodule}, looks for the callable object named +\texttt{myhook}, and calls it. 
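To make the lookup concrete, here is a rough sketch of how a ``\texttt{python:}'' hook value could be turned into a callable. This is not Mercurial's actual code; \texttt{resolve\_hook} is a hypothetical helper that only illustrates the ``import the module, then fetch the named object'' steps described above.

```python
import importlib


def resolve_hook(value):
    """Turn a value such as 'python:mymodule.submodule.myhook' into a callable.

    Hypothetical helper for illustration; Mercurial's own loader differs.
    """
    prefix = "python:"
    if not value.startswith(prefix):
        raise ValueError("not an in-process hook: %r" % value)
    dotted = value[len(prefix):]
    # Everything before the last dot names the module; the rest names the object.
    module_name, _, func_name = dotted.rpartition(".")
    if not module_name or not func_name:
        raise ValueError("expected module.function, got %r" % dotted)
    module = importlib.import_module(module_name)  # searches sys.path / PYTHONPATH
    hook = getattr(module, func_name)
    if not callable(hook):
        raise TypeError("%s is not callable" % dotted)
    return hook
```

If the import fails under a scheme like this, the usual culprit is a module directory that is missing from \envar{PYTHONPATH}.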
+ +\subsection{Writing an in-process hook} + +The simplest in-process hook does nothing, but illustrates the basic +shape of the hook API: +\begin{codesample2} + def myhook(ui, repo, **kwargs): + pass +\end{codesample2} +The first argument to a Python hook is always a +\pymodclass{mercurial.ui}{ui} object. The second is a repository object; +at the moment, it is always an instance of +\pymodclass{mercurial.localrepo}{localrepository}. Following these two +arguments are other keyword arguments. Which ones are passed in +depends on the hook being called, but a hook can ignore arguments it +doesn't care about by dropping them into a keyword argument dict, as +with \texttt{**kwargs} above. + +\section{Some hook examples} + +\subsection{Writing meaningful commit messages} + +It's hard to imagine a useful commit message being very short. The +simple \hook{pretxncommit} hook of figure~\ref{ex:hook:msglen.go} +will prevent you from committing a changeset with a message that is +shorter than ten bytes. + +\begin{figure}[ht] + \interaction{hook.msglen.go} + \caption{A hook that forbids overly short commit messages} + \label{ex:hook:msglen.go} +\end{figure} + +\subsection{Checking for trailing whitespace} + +An interesting use of a commit-related hook is to help you to write +cleaner code. A simple example of ``cleaner code'' is the dictum that +a change should not add any new lines of text that contain ``trailing +whitespace''. Trailing whitespace is a series of space and tab +characters at the end of a line of text. In most cases, trailing +whitespace is unnecessary, invisible noise, but it is occasionally +problematic, and people often prefer to get rid of it. + +You can use either the \hook{precommit} or \hook{pretxncommit} hook to +tell whether you have a trailing whitespace problem. If you use the +\hook{precommit} hook, the hook will not know which files you are +committing, so it will have to check every modified file in the +repository for trailing whitespace.
If you want to commit a change +to just the file \filename{foo}, but the file \filename{bar} contains +trailing whitespace, doing a check in the \hook{precommit} hook will +prevent you from committing \filename{foo} due to the problem with +\filename{bar}. This doesn't seem right. + +Should you choose the \hook{pretxncommit} hook, the check won't occur +until just before the transaction for the commit completes. This will +allow you to check for problems in only the exact files that are being +committed. However, if you entered the commit message interactively +and the hook fails, the transaction will roll back; you'll have to +re-enter the commit message after you fix the trailing whitespace and +run \hgcmd{commit} again. + +\begin{figure}[ht] + \interaction{hook.ws.simple} + \caption{A simple hook that checks for trailing whitespace} + \label{ex:hook:ws.simple} +\end{figure} + +Figure~\ref{ex:hook:ws.simple} introduces a simple \hook{pretxncommit} +hook that checks for trailing whitespace. This hook is short, but not +very helpful. It exits with an error status if a change adds a line +with trailing whitespace to any file, but does not print any +information that might help us to identify the offending file or +line. It does, however, have the nice property of not paying attention to +unmodified lines; only lines that introduce new trailing whitespace +cause problems. + +\begin{figure}[ht] + \interaction{hook.ws.better} + \caption{A better trailing whitespace hook} + \label{ex:hook:ws.better} +\end{figure} + +The example of figure~\ref{ex:hook:ws.better} is much more complex, +but also more useful. It parses a unified diff to see if any lines +add trailing whitespace, and prints the name of the file and the line +number of each such occurrence.
Even better, if the change adds +trailing whitespace, this hook saves the commit comment and prints the +name of the save file before exiting and telling Mercurial to roll the +transaction back, so you can use +\hgcmdargs{commit}{\hgopt{commit}{-l}~\emph{filename}} to reuse the +saved commit message once you've corrected the problem. + +As a final aside, note in figure~\ref{ex:hook:ws.better} the use of +\command{perl}'s in-place editing feature to get rid of trailing +whitespace from a file. This is concise and useful enough that I will +reproduce it here. +\begin{codesample2} + perl -pi -e 's,\\s+\$,,' filename +\end{codesample2} + +\section{Bundled hooks} + +Mercurial ships with several bundled hooks. You can find them in the +\dirname{hgext} directory of a Mercurial source tree. If you are +using a Mercurial binary package, the hooks will be located in the +\dirname{hgext} directory of wherever your package installer put +Mercurial. + +\subsection{\hgext{acl}---access control for parts of a repository} + +The \hgext{acl} extension lets you control which remote users are +allowed to push changesets to a networked server. You can protect any +portion of a repository (including the entire repo), so that a +specific remote user can push changes that do not affect the protected +portion. + +This extension implements access control based on the identity of the +user performing a push, \emph{not} on who committed the changesets +they're pushing. It makes sense to use this hook only if you have a +locked-down server environment that authenticates remote users, and +you want to be sure that only specific users are allowed to push +changes to that server. + +\subsubsection{Configuring the \hook{acl} hook} + +In order to manage incoming changesets, the \hgext{acl} hook must be +used as a \hook{pretxnchangegroup} hook. This lets it see which files +are modified by each incoming changeset, and roll back a group of +changesets if they modify ``forbidden'' files. 
Example: +\begin{codesample2} + [hooks] + pretxnchangegroup.acl = python:hgext.acl.hook +\end{codesample2} + +The \hgext{acl} extension is configured using three sections. + +The \rcsection{acl} section has only one entry, \rcitem{acl}{sources}, +which lists the sources of incoming changesets that the hook should +pay attention to. You don't normally need to configure this section. +\begin{itemize} +\item[\rcitem{acl}{serve}] Control incoming changesets that are arriving + from a remote repository over http or ssh. This is the default + value of \rcitem{acl}{sources}, and usually the only setting you'll + need for this configuration item. +\item[\rcitem{acl}{pull}] Control incoming changesets that are + arriving via a pull from a local repository. +\item[\rcitem{acl}{push}] Control incoming changesets that are + arriving via a push from a local repository. +\item[\rcitem{acl}{bundle}] Control incoming changesets that are + arriving from another repository via a bundle. +\end{itemize} + +The \rcsection{acl.allow} section controls the users that are allowed to +add changesets to the repository. If this section is not present, all +users that are not explicitly denied are allowed. If this section is +present, all users that are not explicitly allowed are denied (so an +empty section means that all users are denied). + +The \rcsection{acl.deny} section determines which users are denied +from adding changesets to the repository. If this section is not +present or is empty, no users are denied. + +The syntaxes for the \rcsection{acl.allow} and \rcsection{acl.deny} +sections are identical. On the left of each entry is a glob pattern +that matches files or directories, relative to the root of the +repository; on the right, a user name. + +In the following example, the user \texttt{docwriter} can only push +changes to the \dirname{docs} subtree of the repository, while +\texttt{intern} can push changes to any file or directory except +\dirname{source/sensitive}. 
+\begin{codesample2} + [acl.allow] + docs/** = docwriter + + [acl.deny] + source/sensitive/** = intern +\end{codesample2} + +\subsubsection{Testing and troubleshooting} + +If you want to test the \hgext{acl} hook, run it with Mercurial's +debugging output enabled. Since you'll probably be running it on a +server where it's not convenient (or sometimes possible) to pass in +the \hggopt{--debug} option, don't forget that you can enable +debugging output in your \hgrc: +\begin{codesample2} + [ui] + debug = true +\end{codesample2} +With this enabled, the \hgext{acl} hook will print enough information +to let you figure out why it is allowing or forbidding pushes from +specific users. + +\subsection{\hgext{bugzilla}---integration with Bugzilla} + +The \hgext{bugzilla} extension adds a comment to a Bugzilla bug +whenever it finds a reference to that bug ID in a commit comment. You +can install this hook on a shared server, so that any time a remote +user pushes changes to this server, the hook gets run. + +It adds a comment to the bug that looks like this (you can configure +the contents of the comment---see below): +\begin{codesample2} + Changeset aad8b264143a, made by Joe User in + the frobnitz repository, refers to this bug. + + For complete details, see + http://hg.domain.com/frobnitz?cmd=changeset;node=aad8b264143a + + Changeset description: + Fix bug 10483 by guarding against some NULL pointers +\end{codesample2} +The value of this hook is that it automates the process of updating a +bug any time a changeset refers to it. If you configure the hook +properly, it makes it easy for people to browse straight from a +Bugzilla bug to a changeset that refers to that bug. + +You can use the code in this hook as a starting point for some more +exotic Bugzilla integration recipes. Here are a few possibilities: +\begin{itemize} +\item Require that every changeset pushed to the server have a valid + bug~ID in its commit comment. 
In this case, you'd want to configure + the hook as a \hook{pretxncommit} hook. This would allow the hook + to reject changes that didn't contain bug IDs. +\item Allow incoming changesets to automatically modify the + \emph{state} of a bug, as well as simply adding a comment. For + example, the hook could recognise the string ``fixed bug 31337'' as + indicating that it should update the state of bug 31337 to + ``requires testing''. +\end{itemize} + +\subsubsection{Configuring the \hook{bugzilla} hook} +\label{sec:hook:bugzilla:config} + +You should configure this hook in your server's \hgrc\ as an +\hook{incoming} hook, for example as follows: +\begin{codesample2} + [hooks] + incoming.bugzilla = python:hgext.bugzilla.hook +\end{codesample2} + +Because of the specialised nature of this hook, and because Bugzilla +was not written with this kind of integration in mind, configuring +this hook is a somewhat involved process. + +Before you begin, you must install the MySQL bindings for Python on +the host(s) where you'll be running the hook. If this is not +available as a binary package for your system, you can download it +from~\cite{web:mysql-python}. + +Configuration information for this hook lives in the +\rcsection{bugzilla} section of your \hgrc. +\begin{itemize} +\item[\rcitem{bugzilla}{version}] The version of Bugzilla installed on + the server. The database schema that Bugzilla uses changes + occasionally, so this hook has to know exactly which schema to use. + At the moment, the only version supported is \texttt{2.16}. +\item[\rcitem{bugzilla}{host}] The hostname of the MySQL server that + stores your Bugzilla data. The database must be configured to allow + connections from whatever host you are running the \hook{bugzilla} + hook on. +\item[\rcitem{bugzilla}{user}] The username with which to connect to + the MySQL server. The database must be configured to allow this + user to connect from whatever host you are running the + \hook{bugzilla} hook on. 
This user must be able to access and + modify Bugzilla tables. The default value of this item is + \texttt{bugs}, which is the standard name of the Bugzilla user in a + MySQL database. +\item[\rcitem{bugzilla}{password}] The MySQL password for the user you + configured above. This is stored as plain text, so you should make + sure that unauthorised users cannot read the \hgrc\ file where you + store this information. +\item[\rcitem{bugzilla}{db}] The name of the Bugzilla database on the + MySQL server. The default value of this item is \texttt{bugs}, + which is the standard name of the MySQL database where Bugzilla + stores its data. +\item[\rcitem{bugzilla}{notify}] If you want Bugzilla to send out a + notification email to subscribers after this hook has added a + comment to a bug, you will need this hook to run a command whenever + it updates the database. The command to run depends on where you + have installed Bugzilla, but it will typically look something like + this, if you have Bugzilla installed in + \dirname{/var/www/html/bugzilla}: + \begin{codesample4} + cd /var/www/html/bugzilla && ./processmail %s nobody@nowhere.com + \end{codesample4} + The Bugzilla \texttt{processmail} program expects to be given a + bug~ID (the hook replaces ``\texttt{\%s}'' with the bug~ID) and an + email address. It also expects to be able to write to some files in + the directory that it runs in. If Bugzilla and this hook are not + installed on the same machine, you will need to find a way to run + \texttt{processmail} on the server where Bugzilla is installed. +\end{itemize} + +\subsubsection{Mapping committer names to Bugzilla user names} + +By default, the \hgext{bugzilla} hook tries to use the email address +of a changeset's committer as the Bugzilla user name with which to +update a bug. If this does not suit your needs, you can map committer +email addresses to Bugzilla user names using a \rcsection{usermap} +section. 
+ +Each item in the \rcsection{usermap} section contains an email address +on the left, and a Bugzilla user name on the right. +\begin{codesample2} + [usermap] + jane.user@example.com = jane +\end{codesample2} +You can either keep the \rcsection{usermap} data in a normal \hgrc, or +tell the \hgext{bugzilla} hook to read the information from an +external \filename{usermap} file. In the latter case, you can store +\filename{usermap} data by itself in (for example) a user-modifiable +repository. This makes it possible to let your users maintain their +own \rcitem{bugzilla}{usermap} entries. The main \hgrc\ file might +look like this: +\begin{codesample2} + # regular hgrc file refers to external usermap file + [bugzilla] + usermap = /home/hg/repos/userdata/bugzilla-usermap.conf +\end{codesample2} +While the \filename{usermap} file that it refers to might look like +this: +\begin{codesample2} + # bugzilla-usermap.conf - inside a hg repository + [usermap] + stephanie@example.com = steph +\end{codesample2} + +\subsubsection{Configuring the text that gets added to a bug} + +You can configure the text that this hook adds as a comment; you +specify it in the form of a Mercurial template. Several \hgrc\ +entries (still in the \rcsection{bugzilla} section) control this +behaviour. +\begin{itemize} +\item[\texttt{strip}] The number of leading path elements to strip + from a repository's path name to construct a partial path for a URL. + For example, if the repositories on your server live under + \dirname{/home/hg/repos}, and you have a repository whose path is + \dirname{/home/hg/repos/app/tests}, then setting \texttt{strip} to + \texttt{4} will give a partial path of \dirname{app/tests}. The + hook will make this partial path available when expanding a + template, as \texttt{webroot}. +\item[\texttt{template}] The text of the template to use. 
In addition + to the usual changeset-related variables, this template can use + \texttt{hgweb} (the value of the \texttt{hgweb} configuration item + above) and \texttt{webroot} (the path constructed using + \texttt{strip} above). +\end{itemize} + +In addition, you can add a \rcitem{web}{baseurl} item to the +\rcsection{web} section of your \hgrc. The \hgext{bugzilla} hook will +make this available when expanding a template, as the base string to +use when constructing a URL that will let users browse from a Bugzilla +comment to view a changeset. Example: +\begin{codesample2} + [web] + baseurl = http://hg.domain.com/ +\end{codesample2} + +Here is an example set of \hgext{bugzilla} hook config information. +\begin{codesample2} + [bugzilla] + host = bugzilla.example.com + password = mypassword + version = 2.16 + # server-side repos live in /home/hg/repos, so strip 4 leading + # separators + strip = 4 + hgweb = http://hg.example.com/ + usermap = /home/hg/repos/notify/bugzilla.conf + template = Changeset \{node|short\}, made by \{author\} in the \{webroot\} + repo, refers to this bug.\\nFor complete details, see + \{hgweb\}\{webroot\}?cmd=changeset;node=\{node|short\}\\nChangeset + description:\\n\\t\{desc|tabindent\} +\end{codesample2} + +\subsubsection{Testing and troubleshooting} + +The most common problems with configuring the \hgext{bugzilla} hook +relate to running Bugzilla's \filename{processmail} script and mapping +committer names to user names. + +Recall from section~\ref{sec:hook:bugzilla:config} above that the user +that runs the Mercurial process on the server is also the one that +will run the \filename{processmail} script. The +\filename{processmail} script sometimes causes Bugzilla to write to +files in its configuration directory, and Bugzilla's configuration +files are usually owned by the user that your web server runs under. + +You can cause \filename{processmail} to be run with the suitable +user's identity using the \command{sudo} command. 
Here is an example +entry for a \filename{sudoers} file. +\begin{codesample2} + hg_user = (httpd_user) NOPASSWD: /var/www/html/bugzilla/processmail-wrapper %s +\end{codesample2} +This allows the \texttt{hg\_user} user to run a +\filename{processmail-wrapper} program under the identity of +\texttt{httpd\_user}. + +This indirection through a wrapper script is necessary, because +\filename{processmail} expects to be run with its current directory +set to wherever you installed Bugzilla; you can't specify that kind of +constraint in a \filename{sudoers} file. The contents of the wrapper +script are simple: +\begin{codesample2} + #!/bin/sh + cd `dirname $0` && ./processmail "$1" nobody@example.com +\end{codesample2} +It doesn't seem to matter what email address you pass to +\filename{processmail}. + +If your \rcsection{usermap} is not set up correctly, users will see an +error message from the \hgext{bugzilla} hook when they push changes +to the server. The error message will look like this: +\begin{codesample2} + cannot find bugzilla user id for john.q.public@example.com +\end{codesample2} +What this means is that the committer's address, +\texttt{john.q.public@example.com}, is not a valid Bugzilla user name, +nor does it have an entry in your \rcsection{usermap} that maps it to +a valid Bugzilla user name. + +\subsection{\hgext{notify}---send email notifications} + +Although Mercurial's built-in web server provides RSS feeds of changes +in every repository, many people prefer to receive change +notifications via email. The \hgext{notify} hook lets you send out +notifications to a set of email addresses whenever changesets arrive +that those subscribers are interested in. + +As with the \hgext{bugzilla} hook, the \hgext{notify} hook is +template-driven, so you can customise the contents of the notification +messages that it sends. 
+ +By default, the \hgext{notify} hook includes a diff of every changeset +that it sends out; you can limit the size of the diff, or turn this +feature off entirely. It is useful for letting subscribers review +changes immediately, rather than clicking to follow a URL. + +\subsubsection{Configuring the \hgext{notify} hook} + +You can set up the \hgext{notify} hook to send one email message per +incoming changeset, or one per incoming group of changesets (all those +that arrived in a single pull or push). +\begin{codesample2} + [hooks] + # send one email per group of changes + changegroup.notify = python:hgext.notify.hook + # send one email per change + incoming.notify = python:hgext.notify.hook +\end{codesample2} + +Configuration information for this hook lives in the +\rcsection{notify} section of a \hgrc\ file. +\begin{itemize} +\item[\rcitem{notify}{test}] By default, this hook does not send out + email at all; instead, it prints the message that it \emph{would} + send. Set this item to \texttt{false} to allow email to be sent. + The reason that sending of email is turned off by default is that it + takes several tries to configure this extension exactly as you would + like, and it would be bad form to spam subscribers with a number of + ``broken'' notifications while you debug your configuration. +\item[\rcitem{notify}{config}] The path to a configuration file that + contains subscription information. This is kept separate from the + main \hgrc\ so that you can maintain it in a repository of its own. + People can then clone that repository, update their subscriptions, + and push the changes back to your server. +\item[\rcitem{notify}{strip}] The number of leading path separator + characters to strip from a repository's path, when deciding whether + a repository has subscribers. 
For example, if the repositories on + your server live in \dirname{/home/hg/repos}, and \hgext{notify} is + considering a repository named \dirname{/home/hg/repos/shared/test}, + setting \rcitem{notify}{strip} to \texttt{4} will cause + \hgext{notify} to trim the path it considers down to + \dirname{shared/test}, and it will match subscribers against that. +\item[\rcitem{notify}{template}] The template text to use when sending + messages. This specifies both the contents of the message header + and its body. +\item[\rcitem{notify}{maxdiff}] The maximum number of lines of diff + data to append to the end of a message. If a diff is longer than + this, it is truncated. By default, this is set to 300. Set this to + \texttt{0} to omit diffs from notification emails. +\item[\rcitem{notify}{sources}] A list of sources of changesets to + consider. This lets you limit \hgext{notify} to only sending out + email about changes that remote users pushed into this repository + via a server, for example. See section~\ref{sec:hook:sources} for + the sources you can specify here. +\end{itemize} + +If you set the \rcitem{web}{baseurl} item in the \rcsection{web} +section, you can use it in a template; it will be available as +\texttt{webroot}. + +Here is an example set of \hgext{notify} configuration information. 
+\begin{codesample2} + [notify] + # really send email + test = false + # subscriber data lives in the notify repo + config = /home/hg/repos/notify/notify.conf + # repos live in /home/hg/repos on server, so strip 4 "/" chars + strip = 4 + template = X-Hg-Repo: \{webroot\} + Subject: \{webroot\}: \{desc|firstline|strip\} + From: \{author\} + + changeset \{node|short\} in \{root\} + details: \{baseurl\}\{webroot\}?cmd=changeset;node=\{node|short\} + description: + \{desc|tabindent|strip\} + + [web] + baseurl = http://hg.example.com/ +\end{codesample2} + +This will produce a message that looks like the following: +\begin{codesample2} + X-Hg-Repo: tests/slave + Subject: tests/slave: Handle error case when slave has no buffers + Date: Wed, 2 Aug 2006 15:25:46 -0700 (PDT) + + changeset 3cba9bfe74b5 in /home/hg/repos/tests/slave + details: http://hg.example.com/tests/slave?cmd=changeset;node=3cba9bfe74b5 + description: + Handle error case when slave has no buffers + diffs (54 lines): + + diff -r 9d95df7cf2ad -r 3cba9bfe74b5 include/tests.h + --- a/include/tests.h Wed Aug 02 15:19:52 2006 -0700 + +++ b/include/tests.h Wed Aug 02 15:25:26 2006 -0700 + @@ -212,6 +212,15 @@ static __inline__ void test_headers(void *h) + [...snip...] +\end{codesample2} + +\subsubsection{Testing and troubleshooting} + +Do not forget that by default, the \hgext{notify} extension \emph{will + not send any mail} until you explicitly configure it to do so, by +setting \rcitem{notify}{test} to \texttt{false}. Until you do that, +it simply prints the message it \emph{would} send. + +\section{Information for writers of hooks} +\label{sec:hook:ref} + +\subsection{In-process hook execution} + +An in-process hook is called with arguments of the following form: +\begin{codesample2} + def myhook(ui, repo, **kwargs): + pass +\end{codesample2} +The \texttt{ui} parameter is a \pymodclass{mercurial.ui}{ui} object. +The \texttt{repo} parameter is a +\pymodclass{mercurial.localrepo}{localrepository} object. 
The +names and values of the \texttt{**kwargs} parameters depend on the +hook being invoked, with the following common features: +\begin{itemize} +\item If a parameter is named \texttt{node} or + \texttt{parent\emph{N}}, it will contain a hexadecimal changeset ID. + The empty string is used to represent ``null changeset ID'' instead + of a string of zeroes. +\item If a parameter is named \texttt{url}, it will contain the URL of + a remote repository, if that can be determined. +\item Boolean-valued parameters are represented as Python + \texttt{bool} objects. +\end{itemize} + +An in-process hook is called without a change to the process's working +directory (unlike external hooks, which are run in the root of the +repository). It must not change the process's working directory, or +it will cause any calls it makes into the Mercurial API to fail. + +If a hook returns a boolean ``false'' value, it is considered to have +succeeded. If it returns a boolean ``true'' value or raises an +exception, it is considered to have failed. A useful way to think of +the calling convention is ``tell me if you fail''. + +Note that changeset IDs are passed into Python hooks as hexadecimal +strings, not the binary hashes that Mercurial's APIs normally use. To +convert a hash from hex to binary, use the +\pymodfunc{mercurial.node}{bin} function. + +\subsection{External hook execution} + +An external hook is passed to the shell of the user running Mercurial. +Features of that shell, such as variable substitution and command +redirection, are available. The hook is run in the root directory of +the repository (unlike in-process hooks, which are run in the same +directory that Mercurial was run in). + +Hook parameters are passed to the hook as environment variables. Each +parameter's name is converted to upper case and prefixed +with the string ``\texttt{HG\_}''.
For example, if the name of a +parameter is ``\texttt{node}'', the name of the environment variable +representing that parameter will be ``\texttt{HG\_NODE}''. + +A boolean parameter is represented as the string ``\texttt{1}'' for +``true'', ``\texttt{0}'' for ``false''. If an environment variable is +named \envar{HG\_NODE}, \envar{HG\_PARENT1} or \envar{HG\_PARENT2}, it +contains a changeset ID represented as a hexadecimal string. The +empty string is used to represent ``null changeset ID'' instead of a +string of zeroes. If an environment variable is named +\envar{HG\_URL}, it will contain the URL of a remote repository, if +that can be determined. + +If a hook exits with a status of zero, it is considered to have +succeeded. If it exits with a non-zero status, it is considered to +have failed. + +\subsection{Finding out where changesets come from} + +A hook that involves the transfer of changesets between a local +repository and another may be able to find out information about the +``far side''. Mercurial knows \emph{how} changes are being +transferred, and in many cases \emph{where} they are being transferred +to or from. + +\subsubsection{Sources of changesets} +\label{sec:hook:sources} + +Mercurial will tell a hook what means are, or were, used to transfer +changesets between repositories. This is provided by Mercurial in a +Python parameter named \texttt{source}, or an environment variable named +\envar{HG\_SOURCE}. + +\begin{itemize} +\item[\texttt{serve}] Changesets are transferred to or from a remote + repository over http or ssh. +\item[\texttt{pull}] Changesets are being transferred via a pull from + one repository into another. +\item[\texttt{push}] Changesets are being transferred via a push from + one repository into another. +\item[\texttt{bundle}] Changesets are being transferred to or from a + bundle. 
+\end{itemize} + +\subsubsection{Where changes are going---remote repository URLs} +\label{sec:hook:url} + +When possible, Mercurial will tell a hook the location of the ``far +side'' of an activity that transfers changeset data between +repositories. This is provided by Mercurial in a Python parameter +named \texttt{url}, or an environment variable named \envar{HG\_URL}. + +This information is not always known. If a hook is invoked in a +repository that is being served via http or ssh, Mercurial cannot tell +where the remote repository is, but it may know where the client is +connecting from. In such cases, the URL will take one of the +following forms: +\begin{itemize} +\item \texttt{remote:ssh:\emph{ip-address}}---remote ssh client, at + the given IP address. +\item \texttt{remote:http:\emph{ip-address}}---remote http client, at + the given IP address. If the client is using SSL, this will be of + the form \texttt{remote:https:\emph{ip-address}}. +\item Empty---no information could be discovered about the remote + client. +\end{itemize} + +\section{Hook reference} + +\subsection{\hook{changegroup}---after remote changesets added} +\label{sec:hook:changegroup} + +This hook is run after a group of pre-existing changesets has been +added to the repository, for example via a \hgcmd{pull} or +\hgcmd{unbundle}. This hook is run once per operation that added one +or more changesets. This is in contrast to the \hook{incoming} hook, +which is run once per changeset, regardless of whether the changesets +arrive in a group. + +Some possible uses for this hook include kicking off an automated +build or test of the added changesets, updating a bug database, or +notifying subscribers that a repository contains new changes. + +Parameters to this hook: +\begin{itemize} +\item[\texttt{node}] A changeset ID. The changeset ID of the first + changeset in the group that was added. 
All changesets between this + and \index{tags!\texttt{tip}}\texttt{tip}, inclusive, were added by + a single \hgcmd{pull}, \hgcmd{push} or \hgcmd{unbundle}. +\item[\texttt{source}] A string. The source of these changes. See + section~\ref{sec:hook:sources} for details. +\item[\texttt{url}] A URL. The location of the remote repository, if + known. See section~\ref{sec:hook:url} for more information. +\end{itemize} + +See also: \hook{incoming} (section~\ref{sec:hook:incoming}), +\hook{prechangegroup} (section~\ref{sec:hook:prechangegroup}), +\hook{pretxnchangegroup} (section~\ref{sec:hook:pretxnchangegroup}) + +\subsection{\hook{commit}---after a new changeset is created} +\label{sec:hook:commit} + +This hook is run after a new changeset has been created. + +Parameters to this hook: +\begin{itemize} +\item[\texttt{node}] A changeset ID. The changeset ID of the newly + committed changeset. +\item[\texttt{parent1}] A changeset ID. The changeset ID of the first + parent of the newly committed changeset. +\item[\texttt{parent2}] A changeset ID. The changeset ID of the second + parent of the newly committed changeset. +\end{itemize} + +See also: \hook{precommit} (section~\ref{sec:hook:precommit}), +\hook{pretxncommit} (section~\ref{sec:hook:pretxncommit}) + +\subsection{\hook{incoming}---after one remote changeset is added} +\label{sec:hook:incoming} + +This hook is run after a pre-existing changeset has been added to the +repository, for example via a \hgcmd{push}. If a group of changesets +was added in a single operation, this hook is called once for each +added changeset. + +You can use this hook for the same purposes as the \hook{changegroup} +hook (section~\ref{sec:hook:changegroup}); it's simply more convenient +sometimes to run a hook once per group of changesets, while other +times it's handier once per changeset. + +Parameters to this hook: +\begin{itemize} +\item[\texttt{node}] A changeset ID. The ID of the newly added + changeset. 
+\item[\texttt{source}] A string. The source of these changes. See + section~\ref{sec:hook:sources} for details. +\item[\texttt{url}] A URL. The location of the remote repository, if + known. See section~\ref{sec:hook:url} for more information. +\end{itemize} + +See also: \hook{changegroup} (section~\ref{sec:hook:changegroup}), \hook{prechangegroup} (section~\ref{sec:hook:prechangegroup}), \hook{pretxnchangegroup} (section~\ref{sec:hook:pretxnchangegroup}) + +\subsection{\hook{outgoing}---after changesets are propagated} +\label{sec:hook:outgoing} + +This hook is run after a group of changesets has been propagated out +of this repository, for example by a \hgcmd{push} or \hgcmd{bundle} +command. + +One possible use for this hook is to notify administrators that +changes have been pulled. + +Parameters to this hook: +\begin{itemize} +\item[\texttt{node}] A changeset ID. The changeset ID of the first + changeset of the group that was sent. +\item[\texttt{source}] A string. The source of the operation + (see section~\ref{sec:hook:sources}). If a remote client pulled + changes from this repository, \texttt{source} will be + \texttt{serve}. If the client that obtained changes from this + repository was local, \texttt{source} will be \texttt{bundle}, + \texttt{pull}, or \texttt{push}, depending on the operation the + client performed. +\item[\texttt{url}] A URL. The location of the remote repository, if + known. See section~\ref{sec:hook:url} for more information. +\end{itemize} + +See also: \hook{preoutgoing} (section~\ref{sec:hook:preoutgoing}) + +\subsection{\hook{prechangegroup}---before starting to add remote changesets} +\label{sec:hook:prechangegroup} + +This controlling hook is run before Mercurial begins to add a group of +changesets from another repository. + +This hook does not have any information about the changesets to be +added, because it is run before transmission of those changesets is +allowed to begin.
If this hook fails, the changesets will not be +transmitted. + +One use for this hook is to prevent external changes from being added +to a repository. For example, you could use this to ``freeze'' a +server-hosted branch temporarily or permanently so that users cannot +push to it, while still allowing a local administrator to modify the +repository. + +Parameters to this hook: +\begin{itemize} +\item[\texttt{source}] A string. The source of these changes. See + section~\ref{sec:hook:sources} for details. +\item[\texttt{url}] A URL. The location of the remote repository, if + known. See section~\ref{sec:hook:url} for more information. +\end{itemize} + +See also: \hook{changegroup} (section~\ref{sec:hook:changegroup}), +\hook{incoming} (section~\ref{sec:hook:incoming}), +\hook{pretxnchangegroup} (section~\ref{sec:hook:pretxnchangegroup}) + +\subsection{\hook{precommit}---before starting to commit a changeset} +\label{sec:hook:precommit} + +This hook is run before Mercurial begins to commit a new changeset. +It is run before Mercurial has any of the metadata for the commit, +such as the files to be committed, the commit message, or the commit +date. + +One use for this hook is to disable the ability to commit new +changesets, while still allowing incoming changesets. Another is to +run a build or test, and only allow the commit to begin if the build +or test succeeds. + +Parameters to this hook: +\begin{itemize} +\item[\texttt{parent1}] A changeset ID. The changeset ID of the first + parent of the working directory. +\item[\texttt{parent2}] A changeset ID. The changeset ID of the second + parent of the working directory. +\end{itemize} +If the commit proceeds, the parents of the working directory will +become the parents of the new changeset.
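As a minimal sketch of how the \hook{precommit} parameters might be used, the following in-process hook refuses to start a commit while the working directory has two parents, that is, while a merge is pending. The function name \texttt{forbid\_merge\_commit} is hypothetical, and the sketch assumes that hook parameters and the argument to \texttt{ui.warn} are ordinary Python strings; newer Mercurial releases may expect byte strings instead.

```python
def forbid_merge_commit(ui, repo, parent1="", parent2="", **kwargs):
    """Hypothetical precommit hook: refuse to start a merge commit.

    Follows the "tell me if you fail" convention: returning True
    reports failure and stops the commit, returning False lets it go on.
    """
    # The empty string stands for the null changeset ID, so a non-empty
    # parent2 means the working directory currently has two parents.
    if parent2:
        # Assumption: ui.warn accepts a plain string here; adjust the
        # literal if your Mercurial version expects bytes.
        ui.warn("merge commits are not allowed in this repository\n")
        return True
    return False
```

Assuming the function lives in an importable module named \texttt{myhooks} (also hypothetical), an entry such as \texttt{precommit.nomerge = python:myhooks.forbid\_merge\_commit} in the \rcsection{hooks} section of a \hgrc\ file should enable it.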
+ +See also: \hook{commit} (section~\ref{sec:hook:commit}), +\hook{pretxncommit} (section~\ref{sec:hook:pretxncommit}) + +\subsection{\hook{preoutgoing}---before starting to propagate changesets} +\label{sec:hook:preoutgoing} + +This hook is invoked before Mercurial knows the identities of the +changesets to be transmitted. + +One use for this hook is to prevent changes from being transmitted to +another repository. + +Parameters to this hook: +\begin{itemize} +\item[\texttt{source}] A string. The source of the operation that is + attempting to obtain changes from this repository (see + section~\ref{sec:hook:sources}). See the documentation for the + \texttt{source} parameter to the \hook{outgoing} hook, in + section~\ref{sec:hook:outgoing}, for possible values of this + parameter. +\item[\texttt{url}] A URL. The location of the remote repository, if + known. See section~\ref{sec:hook:url} for more information. +\end{itemize} + +See also: \hook{outgoing} (section~\ref{sec:hook:outgoing}) + +\subsection{\hook{pretag}---before tagging a changeset} +\label{sec:hook:pretag} + +This controlling hook is run before a tag is created. If the hook +succeeds, creation of the tag proceeds. If the hook fails, the tag is +not created. + +Parameters to this hook: +\begin{itemize} +\item[\texttt{local}] A boolean. Whether the tag is local to this + repository instance (i.e.~stored in \sfilename{.hg/localtags}) or + managed by Mercurial (stored in \sfilename{.hgtags}). +\item[\texttt{node}] A changeset ID. The ID of the changeset to be tagged. +\item[\texttt{tag}] A string. The name of the tag to be created. +\end{itemize} + +If the tag to be created is revision-controlled, the \hook{precommit} +and \hook{pretxncommit} hooks (sections~\ref{sec:hook:precommit} +and~\ref{sec:hook:pretxncommit}) will also be run.
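To illustrate the \hook{pretag} parameters, here is a small sketch of an in-process hook that rejects revision-controlled tags whose names do not follow a release naming policy. The policy itself, the names \texttt{TAG\_PATTERN} and \texttt{check\_tag\_name}, and the assumption that \texttt{tag} arrives as an ordinary Python string are all mine, not something Mercurial prescribes; newer releases may pass byte strings, in which case the pattern would need to be a bytes pattern too.

```python
import re

# Hypothetical naming policy: tags such as "v1.0" or "v2.10.3".
TAG_PATTERN = re.compile(r"^v\d+(\.\d+)+$")


def check_tag_name(ui, repo, node="", tag="", local=False, **kwargs):
    """Hypothetical pretag hook: return True (failure) for bad tag names."""
    # Local tags stay in .hg/localtags and are never shared, so this
    # sketch leaves them unrestricted.
    if local:
        return False
    if TAG_PATTERN.match(tag):
        return False
    # Assumption: ui.warn accepts a plain string in this Mercurial version.
    ui.warn("tag name %r does not match the release naming policy\n" % tag)
    return True
```

Because \hook{pretag} is a controlling hook, returning a true value here should be enough to stop the tag from being created; nothing needs to be undone afterwards.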
+ +See also: \hook{tag} (section~\ref{sec:hook:tag}) + +\subsection{\hook{pretxnchangegroup}---before completing addition of + remote changesets} +\label{sec:hook:pretxnchangegroup} + +This controlling hook is run before a transaction---that manages the +addition of a group of new changesets from outside the +repository---completes. If the hook succeeds, the transaction +completes, and all of the changesets become permanent within this +repository. If the hook fails, the transaction is rolled back, and +the data for the changesets is erased. + +This hook can access the metadata associated with the almost-added +changesets, but it should not do anything permanent with this data. +It must also not modify the working directory. + +While this hook is running, if other Mercurial processes access this +repository, they will be able to see the almost-added changesets as if +they are permanent. This may lead to race conditions if you do not +take steps to avoid them. + +This hook can be used to automatically vet a group of changesets. If +the hook fails, all of the changesets are ``rejected'' when the +transaction rolls back. + +Parameters to this hook: +\begin{itemize} +\item[\texttt{node}] A changeset ID. The changeset ID of the first + changeset in the group that was added. All changesets between this + and \index{tags!\texttt{tip}}\texttt{tip}, inclusive, were added by + a single \hgcmd{pull}, \hgcmd{push} or \hgcmd{unbundle}. +\item[\texttt{source}] A string. The source of these changes. See + section~\ref{sec:hook:sources} for details. +\item[\texttt{url}] A URL. The location of the remote repository, if + known. See section~\ref{sec:hook:url} for more information. 
+\end{itemize} + +See also: \hook{changegroup} (section~\ref{sec:hook:changegroup}), +\hook{incoming} (section~\ref{sec:hook:incoming}), +\hook{prechangegroup} (section~\ref{sec:hook:prechangegroup}) + +\subsection{\hook{pretxncommit}---before completing commit of new changeset} +\label{sec:hook:pretxncommit} + +This controlling hook is run before a transaction---that manages a new +commit---completes. If the hook succeeds, the transaction completes +and the changeset becomes permanent within this repository. If the +hook fails, the transaction is rolled back, and the commit data is +erased. + +This hook can access the metadata associated with the almost-new +changeset, but it should not do anything permanent with this data. It +must also not modify the working directory. + +While this hook is running, if other Mercurial processes access this +repository, they will be able to see the almost-new changeset as if it +is permanent. This may lead to race conditions if you do not take +steps to avoid them. + +Parameters to this hook: +\begin{itemize} +\item[\texttt{node}] A changeset ID. The changeset ID of the newly + committed changeset. +\item[\texttt{parent1}] A changeset ID. The changeset ID of the first + parent of the newly committed changeset. +\item[\texttt{parent2}] A changeset ID. The changeset ID of the second + parent of the newly committed changeset. +\end{itemize} + +See also: \hook{precommit} (section~\ref{sec:hook:precommit}) + +\subsection{\hook{preupdate}---before updating or merging working directory} +\label{sec:hook:preupdate} + +This controlling hook is run before an update or merge of the working +directory begins. It is run only if Mercurial's normal pre-update +checks determine that the update or merge can proceed. If the hook +succeeds, the update or merge may proceed; if it fails, the update or +merge does not start. + +Parameters to this hook: +\begin{itemize} +\item[\texttt{parent1}] A changeset ID. 
The ID of the parent that the + working directory is to be updated to. If the working directory is + being merged, it will not change this parent. +\item[\texttt{parent2}] A changeset ID. Only set if the working + directory is being merged. The ID of the revision that the working + directory is being merged with. +\end{itemize} + +See also: \hook{update} (section~\ref{sec:hook:update}) + +\subsection{\hook{tag}---after tagging a changeset} +\label{sec:hook:tag} + +This hook is run after a tag has been created. + +Parameters to this hook: +\begin{itemize} +\item[\texttt{local}] A boolean. Whether the new tag is local to this + repository instance (i.e.~stored in \sfilename{.hg/localtags}) or + managed by Mercurial (stored in \sfilename{.hgtags}). +\item[\texttt{node}] A changeset ID. The ID of the changeset that was + tagged. +\item[\texttt{tag}] A string. The name of the tag that was created. +\end{itemize} + +If the created tag is revision-controlled, the \hook{commit} hook +(section~\ref{sec:hook:commit}) is run before this hook. + +See also: \hook{pretag} (section~\ref{sec:hook:pretag}) + +\subsection{\hook{update}---after updating or merging working directory} +\label{sec:hook:update} + +This hook is run after an update or merge of the working directory +completes. Since a merge can fail (if the external \command{hgmerge} +command fails to resolve conflicts in a file), this hook communicates +whether the update or merge completed cleanly. + +\begin{itemize} +\item[\texttt{error}] A boolean. Indicates whether the update or + merge completed successfully. +\item[\texttt{parent1}] A changeset ID. The ID of the parent that the + working directory was updated to. If the working directory was + merged, it will not have changed this parent. +\item[\texttt{parent2}] A changeset ID. Only set if the working + directory was merged. The ID of the revision that the working + directory was merged with. 
+\end{itemize} + +See also: \hook{preupdate} (section~\ref{sec:hook:preupdate}) + +%%% Local Variables: +%%% mode: latex +%%% TeX-master: "00book" +%%% End: diff -r 7f0af73f53ab -r 7e52f0cc4516 es/kdiff3.png Binary file es/kdiff3.png has changed diff -r 7f0af73f53ab -r 7e52f0cc4516 es/license.tex --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/es/license.tex Sat Oct 18 15:44:41 2008 -0500 @@ -0,0 +1,138 @@ +\chapter{Open Publication License} +\label{cha:opl} + +Version 1.0, 8 June 1999 + +\section{Requirements on both unmodified and modified versions} + +The Open Publication works may be reproduced and distributed in whole +or in part, in any medium physical or electronic, provided that the +terms of this license are adhered to, and that this license or an +incorporation of it by reference (with any options elected by the +author(s) and/or publisher) is displayed in the reproduction. + +Proper form for an incorporation by reference is as follows: + +\begin{quote} + Copyright (c) \emph{year} by \emph{author's name or designee}. This + material may be distributed only subject to the terms and conditions + set forth in the Open Publication License, v\emph{x.y} or later (the + latest version is presently available at + \url{http://www.opencontent.org/openpub/}). +\end{quote} + +The reference must be immediately followed with any options elected by +the author(s) and/or publisher of the document (see +section~\ref{sec:opl:options}). + +Commercial redistribution of Open Publication-licensed material is +permitted. + +Any publication in standard (paper) book form shall require the +citation of the original publisher and author. The publisher and +author's names shall appear on all outer surfaces of the book. On all +outer surfaces of the book the original publisher's name shall be as +large as the title of the work and cited as possessive with respect to +the title. + +\section{Copyright} + +The copyright to each Open Publication is owned by its author(s) or +designee. 
+ +\section{Scope of license} + +The following license terms apply to all Open Publication works, +unless otherwise explicitly stated in the document. + +Mere aggregation of Open Publication works or a portion of an Open +Publication work with other works or programs on the same media shall +not cause this license to apply to those other works. The aggregate +work shall contain a notice specifying the inclusion of the Open +Publication material and appropriate copyright notice. + +\textbf{Severability}. If any part of this license is found to be +unenforceable in any jurisdiction, the remaining portions of the +license remain in force. + +\textbf{No warranty}. Open Publication works are licensed and provided +``as is'' without warranty of any kind, express or implied, including, +but not limited to, the implied warranties of merchantability and +fitness for a particular purpose or a warranty of non-infringement. + +\section{Requirements on modified works} + +All modified versions of documents covered by this license, including +translations, anthologies, compilations and partial documents, must +meet the following requirements: + +\begin{enumerate} +\item The modified version must be labeled as such. +\item The person making the modifications must be identified and the + modifications dated. +\item Acknowledgement of the original author and publisher if + applicable must be retained according to normal academic citation + practices. +\item The location of the original unmodified document must be + identified. +\item The original author's (or authors') name(s) may not be used to + assert or imply endorsement of the resulting document without the + original author's (or authors') permission. 
+\end{enumerate} + +\section{Good-practice recommendations} + +In addition to the requirements of this license, it is requested from +and strongly recommended of redistributors that: + +\begin{enumerate} +\item If you are distributing Open Publication works on hardcopy or + CD-ROM, you provide email notification to the authors of your intent + to redistribute at least thirty days before your manuscript or media + freeze, to give the authors time to provide updated documents. This + notification should describe modifications, if any, made to the + document. +\item All substantive modifications (including deletions) be either + clearly marked up in the document or else described in an attachment + to the document. +\item Finally, while it is not mandatory under this license, it is + considered good form to offer a free copy of any hardcopy and CD-ROM + expression of an Open Publication-licensed work to its author(s). +\end{enumerate} + +\section{License options} +\label{sec:opl:options} + +The author(s) and/or publisher of an Open Publication-licensed +document may elect certain options by appending language to the +reference to or copy of the license. These options are considered part +of the license instance and must be included with the license (or its +incorporation by reference) in derived works. + +\begin{enumerate}[A] +\item To prohibit distribution of substantively modified versions + without the explicit permission of the author(s). ``Substantive + modification'' is defined as a change to the semantic content of the + document, and excludes mere changes in format or typographical + corrections. + + To accomplish this, add the phrase ``Distribution of substantively + modified versions of this document is prohibited without the + explicit permission of the copyright holder.'' to the license + reference or copy. 
+ +\item To prohibit any publication of this work or derivative works in + whole or in part in standard (paper) book form for commercial + purposes is prohibited unless prior permission is obtained from the + copyright holder. + + To accomplish this, add the phrase ``Distribution of the work or + derivative of the work in any standard (paper) book form is + prohibited unless prior permission is obtained from the copyright + holder.'' to the license reference or copy. +\end{enumerate} + +%%% Local Variables: +%%% mode: latex +%%% TeX-master: "00book" +%%% End: diff -r 7f0af73f53ab -r 7e52f0cc4516 es/mq-collab.tex --- a/es/mq-collab.tex Sat Oct 18 14:35:43 2008 -0500 +++ b/es/mq-collab.tex Sat Oct 18 15:44:41 2008 -0500 @@ -0,0 +1,393 @@ +\chapter{Advanced uses of Mercurial Queues} +\label{chap:mq-collab} + +While it's easy to pick up straightforward uses of Mercurial Queues, +use of a little discipline and some of MQ's less frequently used +capabilities makes it possible to work in complicated development +environments. + +In this chapter, I will use as an example a technique I have used to +manage the development of an Infiniband device driver for the Linux +kernel. The driver in question is large (at least as drivers go), +with 25,000 lines of code spread across 35 source files. It is +maintained by a small team of developers. + +While much of the material in this chapter is specific to Linux, the +same principles apply to any code base for which you're not the +primary owner, and upon which you need to do a lot of development. + +\section{The problem of many targets} + +The Linux kernel changes rapidly, and has never been internally +stable; developers frequently make drastic changes between releases. +This means that a version of the driver that works well with a +particular released version of the kernel will not even \emph{compile} +correctly against, typically, any other version. 
+ +To maintain a driver, we have to keep a number of distinct versions of +Linux in mind. +\begin{itemize} +\item One target is the main Linux kernel development tree. + Maintenance of the code is in this case partly shared by other + developers in the kernel community, who make ``drive-by'' + modifications to the driver as they develop and refine kernel + subsystems. +\item We also maintain a number of ``backports'' to older versions of + the Linux kernel, to support the needs of customers who are running + older Linux distributions that do not incorporate our drivers. (To + \emph{backport} a piece of code is to modify it to work in an older + version of its target environment than the version it was developed + for.) +\item Finally, we make software releases on a schedule that is + necessarily not aligned with those used by Linux distributors and + kernel developers, so that we can deliver new features to customers + without forcing them to upgrade their entire kernels or + distributions. +\end{itemize} + +\subsection{Tempting approaches that don't work well} + +There are two ``standard'' ways to maintain a piece of software that +has to target many different environments. + +The first is to maintain a number of branches, each intended for a +single target. The trouble with this approach is that you must +maintain iron discipline in the flow of changes between repositories. +A new feature or bug fix must start life in a ``pristine'' repository, +then percolate out to every backport repository. Backport changes are +more limited in the branches they should propagate to; a backport +change that is applied to a branch where it doesn't belong will +probably stop the driver from compiling. + +The second is to maintain a single source tree filled with conditional +statements that turn chunks of code on or off depending on the +intended target. 
Because these ``ifdefs'' are not allowed in the +Linux kernel tree, a manual or automatic process must be followed to +strip them out and yield a clean tree. A code base maintained in this +fashion rapidly becomes a rat's nest of conditional blocks that are +difficult to understand and maintain. + +Neither of these approaches is well suited to a situation where you +don't ``own'' the canonical copy of a source tree. In the case of a +Linux driver that is distributed with the standard kernel, Linus's +tree contains the copy of the code that will be treated by the world +as canonical. The upstream version of ``my'' driver can be modified +by people I don't know, without me even finding out about it until +after the changes show up in Linus's tree. + +These approaches have the added weakness of making it difficult to +generate well-formed patches to submit upstream. + +In principle, Mercurial Queues seems like a good candidate to manage a +development scenario such as the above. While this is indeed the +case, MQ contains a few added features that make the job more +pleasant. + +\section{Conditionally applying patches with + guards} + +Perhaps the best way to maintain sanity with so many targets is to be +able to choose specific patches to apply for a given situation. MQ +provides a feature called ``guards'' (which originates with quilt's +\texttt{guards} command) that does just this. To start off, let's +create a simple repository for experimenting in. +\interaction{mq.guards.init} +This gives us a tiny repository that contains two patches that don't +have any dependencies on each other, because they touch different files. + +The idea behind conditional application is that you can ``tag'' a +patch with a \emph{guard}, which is simply a text string of your +choosing, then tell MQ to select specific guards to use when applying +patches. MQ will then either apply, or skip over, a guarded patch, +depending on the guards that you have selected. 
+ +A patch can have an arbitrary number of guards; +each one is \emph{positive} (``apply this patch if this guard is +selected'') or \emph{negative} (``skip this patch if this guard is +selected''). A patch with no guards is always applied. + +\section{Controlling the guards on a patch} + +The \hgxcmd{mq}{qguard} command lets you determine which guards should +apply to a patch, or display the guards that are already in effect. +Without any arguments, it displays the guards on the current topmost +patch. +\interaction{mq.guards.qguard} +To set a positive guard on a patch, prefix the name of the guard with +a ``\texttt{+}''. +\interaction{mq.guards.qguard.pos} +To set a negative guard on a patch, prefix the name of the guard with +a ``\texttt{-}''. +\interaction{mq.guards.qguard.neg} + +\begin{note} + The \hgxcmd{mq}{qguard} command \emph{sets} the guards on a patch; it + doesn't \emph{modify} them. What this means is that if you run + \hgcmdargs{qguard}{+a +b} on a patch, then \hgcmdargs{qguard}{+c} on + the same patch, the \emph{only} guard that will be set on it + afterwards is \texttt{+c}. +\end{note} + +Mercurial stores guards in the \sfilename{series} file; the form in +which they are stored is easy both to understand and to edit by hand. +(In other words, you don't have to use the \hgxcmd{mq}{qguard} command if +you don't want to; it's okay to simply edit the \sfilename{series} +file.) +\interaction{mq.guards.series} + +\section{Selecting the guards to use} + +The \hgxcmd{mq}{qselect} command determines which guards are active at a +given time. The effect of this is to determine which patches MQ will +apply the next time you run \hgxcmd{mq}{qpush}. It has no other effect; in +particular, it doesn't do anything to patches that are already +applied. + +With no arguments, the \hgxcmd{mq}{qselect} command lists the guards +currently in effect, one per line of output. Each argument is treated +as the name of a guard to apply. 
+\interaction{mq.guards.qselect.foo} +In case you're interested, the currently selected guards are stored in +the \sfilename{guards} file. +\interaction{mq.guards.qselect.cat} +We can see the effect the selected guards have when we run +\hgxcmd{mq}{qpush}. +\interaction{mq.guards.qselect.qpush} + +A guard cannot start with a ``\texttt{+}'' or ``\texttt{-}'' +character. The name of a guard must not contain white space, but most +other characters are acceptable. If you try to use a guard with an +invalid name, MQ will complain: +\interaction{mq.guards.qselect.error} +Changing the selected guards changes the patches that are applied. +\interaction{mq.guards.qselect.quux} +You can see in the example below that negative guards take precedence +over positive guards. +\interaction{mq.guards.qselect.foobar} + +\section{MQ's rules for applying patches} + +The rules that MQ uses when deciding whether to apply a patch +are as follows. +\begin{itemize} +\item A patch that has no guards is always applied. +\item If the patch has any negative guard that matches any currently + selected guard, the patch is skipped. +\item If the patch has any positive guard that matches any currently + selected guard, the patch is applied. +\item If the patch has positive or negative guards, but none matches + any currently selected guard, the patch is skipped. +\end{itemize} + +\section{Trimming the work environment} + +In working on the device driver I mentioned earlier, I don't apply the +patches to a normal Linux kernel tree. Instead, I use a repository +that contains only a snapshot of the source files and headers that are +relevant to Infiniband development. This repository is~1\% the size +of a kernel repository, so it's easier to work with. + +I then choose a ``base'' version on top of which the patches are +applied. This is a snapshot of the Linux kernel tree as of a revision +of my choosing. 
When I take the snapshot, I record the changeset ID +from the kernel repository in the commit message. Since the snapshot +preserves the ``shape'' and content of the relevant parts of the +kernel tree, I can apply my patches on top of either my tiny +repository or a normal kernel tree. + +Normally, the base tree atop which the patches apply should be a +snapshot of a very recent upstream tree. This best facilitates the +development of patches that can easily be submitted upstream with few +or no modifications. + +\section{Dividing up the \sfilename{series} file} + +I categorise the patches in the \sfilename{series} file into a number +of logical groups. Each section of like patches begins with a block +of comments that describes the purpose of the patches that follow. + +The sequence of patch groups that I maintain follows. The ordering of +these groups is important; I'll describe why after I introduce the +groups. +\begin{itemize} +\item The ``accepted'' group. Patches that the development team has + submitted to the maintainer of the Infiniband subsystem, and which + he has accepted, but which are not present in the snapshot that the + tiny repository is based on. These are ``read only'' patches, + present only to transform the tree into a similar state as it is in + the upstream maintainer's repository. +\item The ``rework'' group. Patches that I have submitted, but that + the upstream maintainer has requested modifications to before he + will accept them. +\item The ``pending'' group. Patches that I have not yet submitted to + the upstream maintainer, but which we have finished working on. + These will be ``read only'' for a while. If the upstream maintainer + accepts them upon submission, I'll move them to the end of the + ``accepted'' group. If he requests that I modify any, I'll move + them to the beginning of the ``rework'' group. +\item The ``in progress'' group. Patches that are actively being + developed, and should not be submitted anywhere yet. 
+\item The ``backport'' group. Patches that adapt the source tree to + older versions of the kernel tree. +\item The ``do not ship'' group. Patches that for some reason should + never be submitted upstream. For example, one such patch might + change embedded driver identification strings to make it easier to + distinguish, in the field, between an out-of-tree version of the + driver and a version shipped by a distribution vendor. +\end{itemize} + +Now to return to the reasons for ordering groups of patches in this +way. We would like the lowest patches in the stack to be as stable as +possible, so that we will not need to rework higher patches due to +changes in context. Putting patches that will never be changed first +in the \sfilename{series} file serves this purpose. + +We would also like the patches that we know we'll need to modify to be +applied on top of a source tree that resembles the upstream tree as +closely as possible. This is why we keep accepted patches around for +a while. + +The ``backport'' and ``do not ship'' patches float at the end of the +\sfilename{series} file. The backport patches must be applied on top +of all other patches, and the ``do not ship'' patches might as well +stay out of harm's way. + +\section{Maintaining the patch series} + +In my work, I use a number of guards to control which patches are to +be applied. + +\begin{itemize} +\item ``Accepted'' patches are guarded with \texttt{accepted}. I + enable this guard most of the time. When I'm applying the patches + on top of a tree where the patches are already present, I can turn + this guard off, and the patches that follow it will apply cleanly. +\item Patches that are ``finished'', but not yet submitted, have no + guards. If I'm applying the patch stack to a copy of the upstream + tree, I don't need to enable any guards in order to get a reasonably + safe source tree. +\item Those patches that need reworking before being resubmitted are + guarded with \texttt{rework}.
+\item For those patches that are still under development, I use + \texttt{devel}. +\item A backport patch may have several guards, one for each version + of the kernel to which it applies. For example, a patch that + backports a piece of code to~2.6.9 will have a~\texttt{2.6.9} guard. +\end{itemize} +This variety of guards gives me considerable flexibility in +determining what kind of source tree I want to end up with. For most +situations, the selection of appropriate guards is automated during +the build process, but I can manually tune the guards to use for less +common circumstances. + +\subsection{The art of writing backport patches} + +Using MQ, writing a backport patch is a simple process. All such a +patch has to do is modify a piece of code that uses a kernel feature +not present in the older version of the kernel, so that the driver +continues to work correctly under that older version. + +A useful goal when writing a good backport patch is to make your code +look as if it was written for the older version of the kernel you're +targeting. The less obtrusive the patch, the easier it will be to +understand and maintain. If you're writing a collection of backport +patches to avoid the ``rat's nest'' effect of lots of +\texttt{\#ifdef}s (hunks of source code that are only used +conditionally) in your code, don't introduce version-dependent +\texttt{\#ifdef}s into the patches. Instead, write several patches, +each of which makes unconditional changes, and control their +application using guards. + +There are two reasons to divide backport patches into a distinct +group, away from the ``regular'' patches whose effects they modify. +The first is that intermingling the two makes it more difficult to use +a tool like the \hgext{patchbomb} extension to automate the process of +submitting the patches to an upstream maintainer.
The second is that +a backport patch could perturb the context in which a subsequent +regular patch is applied, making it impossible to apply the regular +patch cleanly \emph{without} the earlier backport patch already being +applied. + +\section{Useful tips for developing with MQ} + +\subsection{Organising patches in directories} + +If you're working on a substantial project with MQ, it's not difficult +to accumulate a large number of patches. For example, I have one +patch repository that contains over 250 patches. + +If you can group these patches into separate logical categories, you +can if you like store them in different directories; MQ has no +problems with patch names that contain path separators. + +\subsection{Viewing the history of a patch} +\label{mq-collab:tips:interdiff} + +If you're developing a set of patches over a long time, it's a good +idea to maintain them in a repository, as discussed in +section~\ref{sec:mq:repo}. If you do so, you'll quickly discover that +using the \hgcmd{diff} command to look at the history of changes to a +patch is unworkable. This is in part because you're looking at the +second derivative of the real code (a diff of a diff), but also +because MQ adds noise to the process by modifying time stamps and +directory names when it updates a patch. + +However, you can use the \hgext{extdiff} extension, which is bundled +with Mercurial, to turn a diff of two versions of a patch into +something readable. To do this, you will need a third-party package +called \package{patchutils}~\cite{web:patchutils}. This provides a +command named \command{interdiff}, which shows the differences between +two diffs as a diff. Used on two versions of the same diff, it +generates a diff that represents the diff from the first to the second +version. + +You can enable the \hgext{extdiff} extension in the usual way, by +adding a line to the \rcsection{extensions} section of your \hgrc. 
+\begin{codesample2} + [extensions] + extdiff = +\end{codesample2} +The \command{interdiff} command expects to be passed the names of two +files, but the \hgext{extdiff} extension passes the program it runs a +pair of directories, each of which can contain an arbitrary number of +files. We thus need a small program that will run \command{interdiff} +on each pair of files in these two directories. This program is +available as \sfilename{hg-interdiff} in the \dirname{examples} +directory of the source code repository that accompanies this book. +\excode{hg-interdiff} + +With the \sfilename{hg-interdiff} program in your shell's search path, +you can run it as follows, from inside an MQ patch directory: +\begin{codesample2} + hg extdiff -p hg-interdiff -r A:B my-change.patch +\end{codesample2} +Since you'll probably want to use this long-winded command a lot, you +can get \hgext{extdiff} to make it available as a normal Mercurial +command, again by editing your \hgrc. +\begin{codesample2} + [extdiff] + cmd.interdiff = hg-interdiff +\end{codesample2} +This directs \hgext{extdiff} to make an \texttt{interdiff} command +available, so you can now shorten the previous invocation of +\hgxcmd{extdiff}{extdiff} to something a little more wieldy. +\begin{codesample2} + hg interdiff -r A:B my-change.patch +\end{codesample2} + +\begin{note} + The \command{interdiff} command works well only if the underlying + files against which versions of a patch are generated remain the + same. If you create a patch, modify the underlying files, and then + regenerate the patch, \command{interdiff} may not produce useful + output. +\end{note} + +The \hgext{extdiff} extension is useful for more than merely improving +the presentation of MQ~patches. To read more about it, go to +section~\ref{sec:hgext:extdiff}.
+ +%%% Local Variables: +%%% mode: latex +%%% TeX-master: "00book" +%%% End: diff -r 7f0af73f53ab -r 7e52f0cc4516 es/mq-ref.tex --- a/es/mq-ref.tex Sat Oct 18 14:35:43 2008 -0500 +++ b/es/mq-ref.tex Sat Oct 18 15:44:41 2008 -0500 @@ -0,0 +1,349 @@ +\chapter{Mercurial Queues reference} +\label{chap:mqref} + +\section{MQ command reference} +\label{sec:mqref:cmdref} + +For an overview of the commands provided by MQ, use the command +\hgcmdargs{help}{mq}. + +\subsection{\hgxcmd{mq}{qapplied}---print applied patches} + +The \hgxcmd{mq}{qapplied} command prints the current stack of applied +patches. Patches are printed in oldest-to-newest order, so the last +patch in the list is the ``top'' patch. + +\subsection{\hgxcmd{mq}{qcommit}---commit changes in the queue repository} + +The \hgxcmd{mq}{qcommit} command commits any outstanding changes in the +\sdirname{.hg/patches} repository. This command only works if the +\sdirname{.hg/patches} directory is a repository, i.e.~you created the +directory using \hgcmdargs{qinit}{\hgxopt{mq}{qinit}{-c}} or ran +\hgcmd{init} in the directory after running \hgxcmd{mq}{qinit}. + +This command is shorthand for \hgcmdargs{commit}{--cwd .hg/patches}. + +\subsection{\hgxcmd{mq}{qdelete}---delete a patch from the + \sfilename{series} file} + +The \hgxcmd{mq}{qdelete} command removes the entry for a patch from the +\sfilename{series} file in the \sdirname{.hg/patches} directory. It +does not pop the patch if the patch is already applied. By default, +it does not delete the patch file; use the \hgxopt{mq}{qdel}{-f} option to +do that. + +Options: +\begin{itemize} +\item[\hgxopt{mq}{qdel}{-f}] Delete the patch file. +\end{itemize} + +\subsection{\hgxcmd{mq}{qdiff}---print a diff of the topmost applied patch} + +The \hgxcmd{mq}{qdiff} command prints a diff of the topmost applied patch. +It is equivalent to \hgcmdargs{diff}{-r-2:-1}. 
+ +\subsection{\hgxcmd{mq}{qfold}---merge (``fold'') several patches into one} + +The \hgxcmd{mq}{qfold} command merges multiple patches into the topmost +applied patch, so that the topmost applied patch makes the union of +all of the changes in the patches in question. + +The patches to fold must not be applied; \hgxcmd{mq}{qfold} will exit with +an error if any is. The order in which patches are folded is +significant; \hgcmdargs{qfold}{a b} means ``apply the current topmost +patch, followed by \texttt{a}, followed by \texttt{b}''. + +The comments from the folded patches are appended to the comments of +the destination patch, with each block of comments separated by three +asterisk (``\texttt{*}'') characters. Use the \hgxopt{mq}{qfold}{-e} +option to edit the commit message for the combined patch/changeset +after the folding has completed. + +Options: +\begin{itemize} +\item[\hgxopt{mq}{qfold}{-e}] Edit the commit message and patch description + for the newly folded patch. +\item[\hgxopt{mq}{qfold}{-l}] Use the contents of the given file as the new + commit message and patch description for the folded patch. +\item[\hgxopt{mq}{qfold}{-m}] Use the given text as the new commit message + and patch description for the folded patch. +\end{itemize} + +\subsection{\hgxcmd{mq}{qheader}---display the header/description of a patch} + +The \hgxcmd{mq}{qheader} command prints the header, or description, of a +patch. By default, it prints the header of the topmost applied patch. +Given an argument, it prints the header of the named patch. + +\subsection{\hgxcmd{mq}{qimport}---import a third-party patch into the queue} + +The \hgxcmd{mq}{qimport} command adds an entry for an external patch to the +\sfilename{series} file, and copies the patch into the +\sdirname{.hg/patches} directory. It adds the entry immediately after +the topmost applied patch, but does not push the patch. 
+ +If the \sdirname{.hg/patches} directory is a repository, +\hgxcmd{mq}{qimport} automatically does an \hgcmd{add} of the imported +patch. + +\subsection{\hgxcmd{mq}{qinit}---prepare a repository to work with MQ} + +The \hgxcmd{mq}{qinit} command prepares a repository to work with MQ. It +creates a directory called \sdirname{.hg/patches}. + +Options: +\begin{itemize} +\item[\hgxopt{mq}{qinit}{-c}] Create \sdirname{.hg/patches} as a repository + in its own right. Also creates a \sfilename{.hgignore} file that + will ignore the \sfilename{status} file. +\end{itemize} + +When the \sdirname{.hg/patches} directory is a repository, the +\hgxcmd{mq}{qimport} and \hgxcmd{mq}{qnew} commands automatically \hgcmd{add} +new patches. + +\subsection{\hgxcmd{mq}{qnew}---create a new patch} + +The \hgxcmd{mq}{qnew} command creates a new patch. It takes one mandatory +argument, the name to use for the patch file. The newly created patch +is created empty by default. It is added to the \sfilename{series} +file after the current topmost applied patch, and is immediately +pushed on top of that patch. + +If \hgxcmd{mq}{qnew} finds modified files in the working directory, it will +refuse to create a new patch unless the \hgxopt{mq}{qnew}{-f} option is +used (see below). This behaviour allows you to \hgxcmd{mq}{qrefresh} your +topmost applied patch before you apply a new patch on top of it. + +Options: +\begin{itemize} +\item[\hgxopt{mq}{qnew}{-f}] Create a new patch if the contents of the + working directory are modified. Any outstanding modifications are + added to the newly created patch, so after this command completes, + the working directory will no longer be modified. +\item[\hgxopt{mq}{qnew}{-m}] Use the given text as the commit message. + This text will be stored at the beginning of the patch file, before + the patch data. 
+\end{itemize} + +\subsection{\hgxcmd{mq}{qnext}---print the name of the next patch} + +The \hgxcmd{mq}{qnext} command prints the name of the next patch in +the \sfilename{series} file after the topmost applied patch. This +patch will become the topmost applied patch if you run \hgxcmd{mq}{qpush}. + +\subsection{\hgxcmd{mq}{qpop}---pop patches off the stack} + +The \hgxcmd{mq}{qpop} command removes applied patches from the top of the +stack of applied patches. By default, it removes only one patch. + +This command removes the changesets that represent the popped patches +from the repository, and updates the working directory to undo the +effects of the patches. + +This command takes an optional argument, which it uses as the name or +index of the patch to pop to. If given a name, it will pop patches +until the named patch is the topmost applied patch. If given a +number, \hgxcmd{mq}{qpop} treats the number as an index into the entries in +the series file, counting from zero (empty lines and lines containing +only comments do not count). It pops patches until the patch +identified by the given index is the topmost applied patch. + +The \hgxcmd{mq}{qpop} command does not read or write patches or the +\sfilename{series} file. It is thus safe to \hgxcmd{mq}{qpop} a patch that +you have removed from the \sfilename{series} file, or a patch that you +have renamed or deleted entirely. In the latter two cases, use the +name of the patch as it was when you applied it. + +By default, the \hgxcmd{mq}{qpop} command will not pop any patches if the +working directory has been modified. You can override this behaviour +using the \hgxopt{mq}{qpop}{-f} option, which reverts all modifications in +the working directory. + +Options: +\begin{itemize} +\item[\hgxopt{mq}{qpop}{-a}] Pop all applied patches. This returns the + repository to its state before you applied any patches. +\item[\hgxopt{mq}{qpop}{-f}] Forcibly revert any modifications to the + working directory when popping.
+\item[\hgxopt{mq}{qpop}{-n}] Pop a patch from the named queue. +\end{itemize} + +The \hgxcmd{mq}{qpop} command removes one line from the end of the +\sfilename{status} file for each patch that it pops. + +\subsection{\hgxcmd{mq}{qprev}---print the name of the previous patch} + +The \hgxcmd{mq}{qprev} command prints the name of the patch in the +\sfilename{series} file that comes before the topmost applied patch. +This will become the topmost applied patch if you run \hgxcmd{mq}{qpop}. + +\subsection{\hgxcmd{mq}{qpush}---push patches onto the stack} +\label{sec:mqref:cmd:qpush} + +The \hgxcmd{mq}{qpush} command adds patches onto the applied stack. By +default, it adds only one patch. + +This command creates a new changeset to represent each applied patch, +and updates the working directory to apply the effects of the patches. + +The default data used when creating a changeset are as follows: +\begin{itemize} +\item The commit date and time zone are the current date and time + zone. Because these data are used to compute the identity of a + changeset, this means that if you \hgxcmd{mq}{qpop} a patch and + \hgxcmd{mq}{qpush} it again, the changeset that you push will have a + different identity than the changeset you popped. +\item The author is the same as the default used by the \hgcmd{commit} + command. +\item The commit message is any text from the patch file that comes + before the first diff header. If there is no such text, a default + commit message is used that identifies the name of the patch. +\end{itemize} +If a patch contains a Mercurial patch header (XXX add link), the +information in the patch header overrides these defaults. + +Options: +\begin{itemize} +\item[\hgxopt{mq}{qpush}{-a}] Push all unapplied patches from the + \sfilename{series} file until there are none left to push. +\item[\hgxopt{mq}{qpush}{-l}] Add the name of the patch to the end + of the commit message. 
+\item[\hgxopt{mq}{qpush}{-m}] If a patch fails to apply cleanly, use the + entry for the patch in another saved queue to compute the parameters + for a three-way merge, and perform a three-way merge using the + normal Mercurial merge machinery. Use the resolution of the merge + as the new patch content. +\item[\hgxopt{mq}{qpush}{-n}] Use the named queue if merging while pushing. +\end{itemize} + +The \hgxcmd{mq}{qpush} command reads, but does not modify, the +\sfilename{series} file. It appends one line to the \sfilename{status} +file for each patch that it pushes. + +\subsection{\hgxcmd{mq}{qrefresh}---update the topmost applied patch} + +The \hgxcmd{mq}{qrefresh} command updates the topmost applied patch. It +modifies the patch, removes the old changeset that represented the +patch, and creates a new changeset to represent the modified patch. + +The \hgxcmd{mq}{qrefresh} command looks for the following modifications: +\begin{itemize} +\item Changes to the commit message, i.e.~the text before the first + diff header in the patch file, are reflected in the new changeset + that represents the patch. +\item Modifications to tracked files in the working directory are + added to the patch. +\item Changes to the files tracked using \hgcmd{add}, \hgcmd{copy}, + \hgcmd{remove}, or \hgcmd{rename}. Added files and copy and rename + destinations are added to the patch, while removed files and rename + sources are removed. +\end{itemize} + +Even if \hgxcmd{mq}{qrefresh} detects no changes, it still recreates the +changeset that represents the patch. This causes the identity of the +changeset to differ from the previous changeset that identified the +patch. + +Options: +\begin{itemize} +\item[\hgxopt{mq}{qrefresh}{-e}] Modify the commit message and patch + description, using the preferred text editor. +\item[\hgxopt{mq}{qrefresh}{-m}] Modify the commit message and patch + description, using the given text.
+\item[\hgxopt{mq}{qrefresh}{-l}] Modify the commit message and patch + description, using text from the given file. +\end{itemize} + +\subsection{\hgxcmd{mq}{qrename}---rename a patch} + +The \hgxcmd{mq}{qrename} command renames a patch, and changes the entry for +the patch in the \sfilename{series} file. + +With a single argument, \hgxcmd{mq}{qrename} renames the topmost applied +patch. With two arguments, it renames its first argument to its +second. + +\subsection{\hgxcmd{mq}{qrestore}---restore saved queue state} + +XXX No idea what this does. + +\subsection{\hgxcmd{mq}{qsave}---save current queue state} + +XXX Likewise. + +\subsection{\hgxcmd{mq}{qseries}---print the entire patch series} + +The \hgxcmd{mq}{qseries} command prints the entire patch series from the +\sfilename{series} file. It prints only patch names, not empty lines +or comments. It prints in order from first to be applied to last. + +\subsection{\hgxcmd{mq}{qtop}---print the name of the current patch} + +The \hgxcmd{mq}{qtop} command prints the name of the topmost currently +applied patch. + +\subsection{\hgxcmd{mq}{qunapplied}---print patches not yet applied} + +The \hgxcmd{mq}{qunapplied} command prints the names of patches from the +\sfilename{series} file that are not yet applied. It prints them in +order from the next patch that will be pushed to the last. + +\subsection{\hgcmd{strip}---remove a revision and descendants} + +The \hgcmd{strip} command removes a revision, and all of its +descendants, from the repository. It undoes the effects of the +removed revisions from the repository, and updates the working +directory to the first parent of the removed revision. + +The \hgcmd{strip} command saves a backup of the removed changesets in +a bundle, so that they can be reapplied if removed in error. + +Options: +\begin{itemize} +\item[\hgopt{strip}{-b}] Save unrelated changesets that are intermixed + with the stripped changesets in the backup bundle.
+\item[\hgopt{strip}{-f}] If a branch has multiple heads, remove all + heads. XXX This should be renamed, and use \texttt{-f} to strip revs + when there are pending changes. +\item[\hgopt{strip}{-n}] Do not save a backup bundle. +\end{itemize} + +\section{MQ file reference} + +\subsection{The \sfilename{series} file} + +The \sfilename{series} file contains a list of the names of all +patches that MQ can apply. It is represented as a list of names, with +one name saved per line. Leading and trailing white space in each +line are ignored. + +Lines may contain comments. A comment begins with the ``\texttt{\#}'' +character, and extends to the end of the line. Empty lines, and lines +that contain only comments, are ignored. + +You will often need to edit the \sfilename{series} file by hand, hence +the support for comments and empty lines noted above. For example, +you can comment out a patch temporarily, and \hgxcmd{mq}{qpush} will skip +over that patch when applying patches. You can also change the order +in which patches are applied by reordering their entries in the +\sfilename{series} file. + +Placing the \sfilename{series} file under revision control is also +supported; it is a good idea to place all of the patches that it +refers to under revision control, as well. If you create a patch +directory using the \hgxopt{mq}{qinit}{-c} option to \hgxcmd{mq}{qinit}, this +will be done for you automatically. + +\subsection{The \sfilename{status} file} + +The \sfilename{status} file contains the names and changeset hashes of +all patches that MQ currently has applied. Unlike the +\sfilename{series} file, this file is not intended for editing. You +should not place this file under revision control, or modify it in any +way. It is used by MQ strictly for internal book-keeping. 
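+
+As a rough illustration of the \sfilename{series} file format described
+earlier in this section, a hand-edited file might look something like
+the sketch below.  The patch names are hypothetical, invented only for
+this example; the comment and blank-line handling follows the rules
+given above.
+\begin{codesample2}
+  # Fixes that have already been sent upstream
+  fix-locking.patch
+
+  # Temporarily disabled: the entry is commented out, so it is skipped
+  # add-stats.patch
+
+  cleanup-headers.patch   # a comment may also follow a patch name
+\end{codesample2}
+With a file like this, \hgxcmd{mq}{qseries} would be expected to list
+only \texttt{fix-locking.patch} and \texttt{cleanup-headers.patch}, in
+that order, since empty lines and comment-only lines are ignored.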
+ +%%% Local Variables: +%%% mode: latex +%%% TeX-master: "00book" +%%% End: diff -r 7f0af73f53ab -r 7e52f0cc4516 es/mq.tex --- a/es/mq.tex Sat Oct 18 14:35:43 2008 -0500 +++ b/es/mq.tex Sat Oct 18 15:44:41 2008 -0500 @@ -0,0 +1,1043 @@ +\chapter{Managing change with Mercurial Queues} +\label{chap:mq} + +\section{The patch management problem} +\label{sec:mq:patch-mgmt} + +Here is a common scenario: you need to install a software package from +source, but you find a bug that you must fix in the source before you +can start using the package. You make your changes, forget about the +package for a while, and a few months later you need to upgrade to a +newer version of the package. If the newer version of the package +still has the bug, you must extract your fix from the older source +tree and apply it against the newer version. This is a tedious task, +and it's easy to make mistakes. + +This is a simple case of the ``patch management'' problem. You have +an ``upstream'' source tree that you can't change; you need to make +some local changes on top of the upstream tree; and you'd like to be +able to keep those changes separate, so that you can apply them to +newer versions of the upstream source. + +The patch management problem arises in many situations. Probably the +most visible is that a user of an open source software project will +contribute a bug fix or new feature to the project's maintainers in the +form of a patch. + +Distributors of operating systems that include open source software +often need to make changes to the packages they distribute so that +they will build properly in their environments. + +When you have few changes to maintain, it is easy to manage a single +patch using the standard \command{diff} and \command{patch} programs +(see section~\ref{sec:mq:patch} for a discussion of these tools). 
+Once the number of changes grows, it starts to make sense to maintain +patches as discrete ``chunks of work,'' so that for example a single +patch will contain only one bug fix (the patch might modify several +files, but it's doing ``only one thing''), and you may have a number +of such patches for different bugs you need fixed and local changes +you require. In this situation, if you submit a bug fix patch to the +upstream maintainers of a package and they include your fix in a +subsequent release, you can simply drop that single patch when you're +updating to the newer release. + +Maintaining a single patch against an upstream tree is a little +tedious and error-prone, but not difficult. However, the complexity +of the problem grows rapidly as the number of patches you have to +maintain increases. With more than a tiny number of patches in hand, +understanding which ones you have applied and maintaining them moves +from messy to overwhelming. + +Fortunately, Mercurial includes a powerful extension, Mercurial Queues +(or simply ``MQ''), that massively simplifies the patch management +problem. + +\section{The prehistory of Mercurial Queues} +\label{sec:mq:history} + +During the late 1990s, several Linux kernel developers started to +maintain ``patch series'' that modified the behaviour of the Linux +kernel. Some of these series were focused on stability, some on +feature coverage, and others were more speculative. + +The sizes of these patch series grew rapidly. In 2002, Andrew Morton +published some shell scripts he had been using to automate the task of +managing his patch queues. Andrew was successfully using these +scripts to manage hundreds (sometimes thousands) of patches on top of +the Linux kernel. 
+ +\subsection{A patchwork quilt} +\label{sec:mq:quilt} + +In early 2003, Andreas Gruenbacher and Martin Quinson borrowed the +approach of Andrew's scripts and published a tool called ``patchwork +quilt''~\cite{web:quilt}, or simply ``quilt'' +(see~\cite{gruenbacher:2005} for a paper describing it). Because +quilt substantially automated patch management, it rapidly gained a +large following among open source software developers. + +Quilt manages a \emph{stack of patches} on top of a directory tree. +To begin, you tell quilt to manage a directory tree, and tell it which +files you want to manage; it stores away the names and contents of +those files. To fix a bug, you create a new patch (using a single +command), edit the files you need to fix, then ``refresh'' the patch. + +The refresh step causes quilt to scan the directory tree; it updates +the patch with all of the changes you have made. You can create +another patch on top of the first, which will track the changes +required to modify the tree from ``tree with one patch applied'' to +``tree with two patches applied''. + +You can \emph{change} which patches are applied to the tree. If you +``pop'' a patch, the changes made by that patch will vanish from the +directory tree. Quilt remembers which patches you have popped, +though, so you can ``push'' a popped patch again, and the directory +tree will be restored to contain the modifications in the patch. Most +importantly, you can run the ``refresh'' command at any time, and the +topmost applied patch will be updated. This means that you can, at +any time, change both which patches are applied and what +modifications those patches make. + +Quilt knows nothing about revision control tools, so it works equally +well on top of an unpacked tarball or a Subversion working copy. 
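The push/pop model described above can be pictured as an ordered list of known patches plus a count of how many are currently applied. The following Python sketch is only an illustration under that assumption: the class name PatchStack and its methods are hypothetical, not part of quilt, and the sketch tracks patch names only, not file contents.

```python
class PatchStack:
    """Toy model of a quilt-style patch stack (illustrative only)."""

    def __init__(self, series):
        self.series = list(series)  # every known patch, oldest first
        self.applied_count = 0      # how many of them are applied

    def applied(self):
        """Patches whose changes are present in the tree."""
        return self.series[:self.applied_count]

    def unapplied(self):
        """Patches that are known but currently popped."""
        return self.series[self.applied_count:]

    def push(self):
        """Apply the next unapplied patch and return its name."""
        if self.applied_count == len(self.series):
            raise IndexError("all patches are already applied")
        self.applied_count += 1
        return self.series[self.applied_count - 1]

    def pop(self):
        """Unapply the topmost applied patch and return its name."""
        if self.applied_count == 0:
            raise IndexError("no patches are applied")
        self.applied_count -= 1
        return self.series[self.applied_count]

    def top(self):
        """The patch that a refresh would update, or None."""
        if self.applied_count == 0:
            return None
        return self.series[self.applied_count - 1]
```

Popping never removes a patch from the series; it only moves the boundary between applied and unapplied patches, which is why a popped patch can be pushed again later.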
+ +\subsection{From patchwork quilt to Mercurial Queues} +\label{sec:mq:quilt-mq} + +In mid-2005, Chris Mason took the features of quilt and wrote an +extension that he called Mercurial Queues, which added quilt-like +behaviour to Mercurial. + +The key difference between quilt and MQ is that quilt knows nothing +about revision control systems, while MQ is \emph{integrated} into +Mercurial. Each patch that you push is represented as a Mercurial +changeset. Pop a patch, and the changeset goes away. + +Because quilt does not care about revision control tools, it is still +a tremendously useful piece of software to know about for situations +where you cannot use Mercurial and MQ. + +\section{The huge advantage of MQ} + +I cannot overstate the value that MQ offers through the unification of +patches and revision control. + +A major reason that patches have persisted in the free software and +open source world---in spite of the availability of increasingly +capable revision control tools over the years---is the \emph{agility} +they offer. + +Traditional revision control tools make a permanent, irreversible +record of everything that you do. While this has great value, it's +also somewhat stifling. If you want to perform a wild-eyed +experiment, you have to be careful in how you go about it, or you risk +leaving unneeded---or worse, misleading or destabilising---traces of +your missteps and errors in the permanent revision record. + +By contrast, MQ's marriage of distributed revision control with +patches makes it much easier to isolate your work. Your patches live +on top of normal revision history, and you can make them disappear or +reappear at will. If you don't like a patch, you can drop it. If a +patch isn't quite as you want it to be, simply fix it---as many times +as you need to, until you have refined it into the form you desire. 
+ +As an example, the integration of patches with revision control makes +understanding patches and debugging their effects---and their +interplay with the code they're based on---\emph{enormously} easier. +Since every applied patch has an associated changeset, you can use +\hgcmdargs{log}{\emph{filename}} to see which changesets and patches +affected a file. You can use the \hgext{bisect} command to +binary-search through all changesets and applied patches to see where +a bug got introduced or fixed. You can use the \hgcmd{annotate} +command to see which changeset or patch modified a particular line of +a source file. And so on. + +\section{Understanding patches} +\label{sec:mq:patch} + +Because MQ doesn't hide its patch-oriented nature, it is helpful to +understand what patches are, and a little about the tools that work +with them. + +The traditional Unix \command{diff} command compares two files, and +prints a list of differences between them. The \command{patch} command +understands these differences as \emph{modifications} to make to a +file. Take a look at figure~\ref{ex:mq:diff} for a simple example of +these commands in action. + +\begin{figure}[ht] + \interaction{mq.dodiff.diff} + \caption{Simple uses of the \command{diff} and \command{patch} commands} + \label{ex:mq:diff} +\end{figure} + +The type of file that \command{diff} generates (and \command{patch} +takes as input) is called a ``patch'' or a ``diff''; there is no +difference between a patch and a diff. (We'll use the term ``patch'', +since it's more commonly used.) + +A patch file can start with arbitrary text; the \command{patch} +command ignores this text, but MQ uses it as the commit message when +creating changesets. To find the beginning of the patch content, +\command{patch} searches for the first line that starts with the +string ``\texttt{diff~-}''. + +MQ works with \emph{unified} diffs (\command{patch} can accept several +other diff formats, but MQ doesn't). 
A unified diff contains two +kinds of header. The \emph{file header} describes the file being +modified; it contains the name of the file to modify. When +\command{patch} sees a new file header, it looks for a file with that +name to start modifying. + +After the file header comes a series of \emph{hunks}. Each hunk +starts with a header; this identifies the range of line numbers within +the file that the hunk should modify. Following the header, a hunk +starts and ends with a few (usually three) lines of text from the +unmodified file; these are called the \emph{context} for the hunk. If +there's only a small amount of context between successive hunks, +\command{diff} doesn't print a new hunk header; it just runs the hunks +together, with a few lines of context between modifications. + +Each line of context begins with a space character. Within the hunk, +a line that begins with ``\texttt{-}'' means ``remove this line,'' +while a line that begins with ``\texttt{+}'' means ``insert this +line.'' For example, a line that is modified is represented by one +deletion and one insertion. + +We will return to some of the more subtle aspects of patches later (in +section~\ref{sec:mq:adv-patch}), but you should have enough information +now to use MQ. + +\section{Getting started with Mercurial Queues} +\label{sec:mq:start} + +Because MQ is implemented as an extension, you must explicitly enable +it before you can use it. (You don't need to download anything; MQ ships +with the standard Mercurial distribution.) To enable MQ, edit your +\tildefile{.hgrc} file, and add the lines in figure~\ref{ex:mq:config}. + +\begin{figure}[ht] + \begin{codesample4} + [extensions] + hgext.mq = + \end{codesample4} + \caption{Contents to add to \tildefile{.hgrc} to enable the MQ extension} + \label{ex:mq:config} +\end{figure} + +Once the extension is enabled, it will make a number of new commands +available.
To verify that the extension is working, you can use +\hgcmd{help} to see if the \hgxcmd{mq}{qinit} command is now available; see +the example in figure~\ref{ex:mq:enabled}. + +\begin{figure}[ht] + \interaction{mq.qinit-help.help} + \caption{How to verify that MQ is enabled} + \label{ex:mq:enabled} +\end{figure} + +You can use MQ with \emph{any} Mercurial repository, and its commands +only operate within that repository. To get started, simply prepare +the repository using the \hgxcmd{mq}{qinit} command (see +figure~\ref{ex:mq:qinit}). This command creates an empty directory +called \sdirname{.hg/patches}, where MQ will keep its metadata. As +with many Mercurial commands, the \hgxcmd{mq}{qinit} command prints nothing +if it succeeds. + +\begin{figure}[ht] + \interaction{mq.tutorial.qinit} + \caption{Preparing a repository for use with MQ} + \label{ex:mq:qinit} +\end{figure} + +\begin{figure}[ht] + \interaction{mq.tutorial.qnew} + \caption{Creating a new patch} + \label{ex:mq:qnew} +\end{figure} + +\subsection{Creating a new patch} + +To begin work on a new patch, use the \hgxcmd{mq}{qnew} command. This +command takes one argument, the name of the patch to create. MQ will +use this as the name of an actual file in the \sdirname{.hg/patches} +directory, as you can see in figure~\ref{ex:mq:qnew}. + +Also newly present in the \sdirname{.hg/patches} directory are two +other files, \sfilename{series} and \sfilename{status}. The +\sfilename{series} file lists all of the patches that MQ knows about +for this repository, with one patch per line. Mercurial uses the +\sfilename{status} file for internal book-keeping; it tracks all of the +patches that MQ has \emph{applied} in this repository. + +\begin{note} + You may sometimes want to edit the \sfilename{series} file by hand; + for example, to change the sequence in which some patches are + applied. 
However, manually editing the \sfilename{status} file is + almost always a bad idea, as it's easy to corrupt MQ's idea of what + is happening. +\end{note} + +Once you have created your new patch, you can edit files in the +working directory as you usually would. All of the normal Mercurial +commands, such as \hgcmd{diff} and \hgcmd{annotate}, work exactly as +they did before. + +\subsection{Refreshing a patch} + +When you reach a point where you want to save your work, use the +\hgxcmd{mq}{qrefresh} command (figure~\ref{ex:mq:qrefresh}) to update the patch +you are working on. This command folds the changes you have made in +the working directory into your patch, and updates its corresponding +changeset to contain those changes. + +\begin{figure}[ht] + \interaction{mq.tutorial.qrefresh} + \caption{Refreshing a patch} + \label{ex:mq:qrefresh} +\end{figure} + +You can run \hgxcmd{mq}{qrefresh} as often as you like, so it's a good way +to ``checkpoint'' your work. Refresh your patch at an opportune +time; try an experiment; and if the experiment doesn't work out, +\hgcmd{revert} your modifications back to the last time you refreshed. + +\begin{figure}[ht] + \interaction{mq.tutorial.qrefresh2} + \caption{Refresh a patch many times to accumulate changes} + \label{ex:mq:qrefresh2} +\end{figure} + +\subsection{Stacking and tracking patches} + +Once you have finished working on a patch, or need to work on another, +you can use the \hgxcmd{mq}{qnew} command again to create a new patch. +Mercurial will apply this patch on top of your existing patch. See +figure~\ref{ex:mq:qnew2} for an example. Notice that the patch +contains the changes in our prior patch as part of its context (you +can see this more clearly in the output of \hgcmd{annotate}).
+ +\begin{figure}[ht] + \interaction{mq.tutorial.qnew2} + \caption{Stacking a second patch on top of the first} + \label{ex:mq:qnew2} +\end{figure} + +So far, with the exception of \hgxcmd{mq}{qnew} and \hgxcmd{mq}{qrefresh}, we've +been careful to only use regular Mercurial commands. However, MQ +provides many commands that are easier to use when you are thinking +about patches, as illustrated in figure~\ref{ex:mq:qseries}: + +\begin{itemize} +\item The \hgxcmd{mq}{qseries} command lists every patch that MQ knows + about in this repository, from oldest to newest (most recently + \emph{created}). +\item The \hgxcmd{mq}{qapplied} command lists every patch that MQ has + \emph{applied} in this repository, again from oldest to newest (most + recently applied). +\end{itemize} + +\begin{figure}[ht] + \interaction{mq.tutorial.qseries} + \caption{Understanding the patch stack with \hgxcmd{mq}{qseries} and + \hgxcmd{mq}{qapplied}} + \label{ex:mq:qseries} +\end{figure} + +\subsection{Manipulating the patch stack} + +The previous discussion implied that there must be a difference +between ``known'' and ``applied'' patches, and there is. MQ can +manage a patch without it being applied in the repository. + +An \emph{applied} patch has a corresponding changeset in the +repository, and the effects of the patch and changeset are visible in +the working directory. You can undo the application of a patch using +the \hgxcmd{mq}{qpop} command. MQ still \emph{knows about}, or manages, a +popped patch, but the patch no longer has a corresponding changeset in +the repository, and the working directory does not contain the changes +made by the patch. Figure~\ref{fig:mq:stack} illustrates the +difference between applied and tracked patches. + +\begin{figure}[ht] + \centering + \grafix{mq-stack} + \caption{Applied and unapplied patches in the MQ patch stack} + \label{fig:mq:stack} +\end{figure} + +You can reapply an unapplied, or popped, patch using the \hgxcmd{mq}{qpush} +command. 
This creates a new changeset to correspond to the patch, and +the patch's changes once again become present in the working +directory. See figure~\ref{ex:mq:qpop} for examples of \hgxcmd{mq}{qpop} +and \hgxcmd{mq}{qpush} in action. Notice that once we have popped one or +two patches, the output of \hgxcmd{mq}{qseries} remains the same, while +that of \hgxcmd{mq}{qapplied} has changed. + +\begin{figure}[ht] + \interaction{mq.tutorial.qpop} + \caption{Modifying the stack of applied patches} + \label{ex:mq:qpop} +\end{figure} + +\subsection{Pushing and popping many patches} + +While \hgxcmd{mq}{qpush} and \hgxcmd{mq}{qpop} each operate on a single patch at +a time by default, you can push and pop many patches in one go. The +\hgxopt{mq}{qpush}{-a} option to \hgxcmd{mq}{qpush} causes it to push all +unapplied patches, while the \hgxopt{mq}{qpop}{-a} option to \hgxcmd{mq}{qpop} +causes it to pop all applied patches. (For some more ways to push and +pop many patches, see section~\ref{sec:mq:perf} below.) + +\begin{figure}[ht] + \interaction{mq.tutorial.qpush-a} + \caption{Pushing all unapplied patches} + \label{ex:mq:qpush-a} +\end{figure} + +\subsection{Safety checks, and overriding them} + +Several MQ commands check the working directory before they do +anything, and fail if they find any modifications. They do this to +ensure that you won't lose any changes that you have made, but not yet +incorporated into a patch. Figure~\ref{ex:mq:add} illustrates this; +the \hgxcmd{mq}{qnew} command will not create a new patch if there are +outstanding changes, caused in this case by the \hgcmd{add} of +\filename{file3}. + +\begin{figure}[ht] + \interaction{mq.tutorial.add} + \caption{Forcibly creating a patch} + \label{ex:mq:add} +\end{figure} + +Commands that check the working directory all take an ``I know what +I'm doing'' option, which is always named \option{-f}. The exact +meaning of \option{-f} depends on the command.
For example, +\hgcmdargs{qnew}{\hgxopt{mq}{qnew}{-f}} will incorporate any outstanding +changes into the new patch it creates, but +\hgcmdargs{qpop}{\hgxopt{mq}{qpop}{-f}} will revert modifications to any +files affected by the patch that it is popping. Be sure to read the +documentation for a command's \option{-f} option before you use it! + +\subsection{Working on several patches at once} + +The \hgxcmd{mq}{qrefresh} command always refreshes the \emph{topmost} +applied patch. This means that you can suspend work on one patch (by +refreshing it), pop or push to make a different patch the top, and +work on \emph{that} patch for a while. + +Here's an example that illustrates how you can use this ability. +Let's say you're developing a new feature as two patches. The first +is a change to the core of your software, and the second---layered on +top of the first---changes the user interface to use the code you just +added to the core. If you notice a bug in the core while you're +working on the UI patch, it's easy to fix the core. Simply +\hgxcmd{mq}{qrefresh} the UI patch to save your in-progress changes, and +\hgxcmd{mq}{qpop} down to the core patch. Fix the core bug, +\hgxcmd{mq}{qrefresh} the core patch, and \hgxcmd{mq}{qpush} back to the UI +patch to continue where you left off. + +\section{More about patches} +\label{sec:mq:adv-patch} + +MQ uses the GNU \command{patch} command to apply patches, so it's +helpful to know a few more detailed aspects of how \command{patch} +works, and about patches themselves. + +\subsection{The strip count} + +If you look at the file headers in a patch, you will notice that the +pathnames usually have an extra component on the front that isn't +present in the actual path name. This is a holdover from the way that +people used to generate patches (people still do this, but it's +somewhat rare with modern revision control tools). + +Alice would unpack a tarball, edit her files, then decide that she +wanted to create a patch.
So she'd rename her working directory, +unpack the tarball again (hence the need for the rename), and use the +\cmdopt{diff}{-r} and \cmdopt{diff}{-N} options to \command{diff} to +recursively generate a patch between the unmodified directory and the +modified one. The result would be that the name of the unmodified +directory would be at the front of the left-hand path in every file +header, and the name of the modified directory would be at the front +of the right-hand path. + +Since someone receiving a patch from the Alices of the net would be +unlikely to have unmodified and modified directories with exactly the +same names, the \command{patch} command has a \cmdopt{patch}{-p} +option that indicates the number of leading path name components to +strip when trying to apply a patch. This number is called the +\emph{strip count}. + +An option of ``\texttt{-p1}'' means ``use a strip count of one''. If +\command{patch} sees a file name \filename{foo/bar/baz} in a file +header, it will strip \filename{foo} and try to patch a file named +\filename{bar/baz}. (Strictly speaking, the strip count refers to the +number of \emph{path separators} (and the components that go with them +) to strip. A strip count of one will turn \filename{foo/bar} into +\filename{bar}, but \filename{/foo/bar} (notice the extra leading +slash) into \filename{foo/bar}.) + +The ``standard'' strip count for patches is one; almost all patches +contain one leading path name component that needs to be stripped. +Mercurial's \hgcmd{diff} command generates path names in this form, +and the \hgcmd{import} command and MQ expect patches to have a strip +count of one. + +If you receive a patch from someone that you want to add to your patch +queue, and the patch needs a strip count other than one, you cannot +just \hgxcmd{mq}{qimport} the patch, because \hgxcmd{mq}{qimport} does not yet +have a \texttt{-p} option (see~\bug{311}). 
Your best bet is to +\hgxcmd{mq}{qnew} a patch of your own, then use \cmdargs{patch}{-p\emph{N}} +to apply their patch, followed by \hgcmd{addremove} to pick up any +files added or removed by the patch, followed by \hgxcmd{mq}{qrefresh}. +This complexity may become unnecessary; see~\bug{311} for details. +\subsection{Strategies for applying a patch} + +When \command{patch} applies a hunk, it tries a handful of +successively less accurate strategies to try to make the hunk apply. +This falling-back technique often makes it possible to take a patch +that was generated against an old version of a file, and apply it +against a newer version of that file. + +First, \command{patch} tries an exact match, where the line numbers, +the context, and the text to be modified must apply exactly. If it +cannot make an exact match, it tries to find an exact match for the +context, without honouring the line numbering information. If this +succeeds, it prints a line of output saying that the hunk was applied, +but at some \emph{offset} from the original line number. + +If a context-only match fails, \command{patch} removes the first and +last lines of the context, and tries a \emph{reduced} context-only +match. If the hunk with reduced context succeeds, it prints a message +saying that it applied the hunk with a \emph{fuzz factor} (the number +after the fuzz factor indicates how many lines of context +\command{patch} had to trim before the patch applied). + +When neither of these techniques works, \command{patch} prints a +message saying that the hunk in question was rejected. It saves +rejected hunks (also simply called ``rejects'') to a file with the +same name, and an added \sfilename{.rej} extension. It also saves an +unmodified copy of the file with a \sfilename{.orig} extension; the +copy of the file without any extensions will contain any changes made +by hunks that \emph{did} apply cleanly. 
If you have a patch that +modifies \filename{foo} with six hunks, and one of them fails to +apply, you will have: an unmodified \filename{foo.orig}, a +\filename{foo.rej} containing one hunk, and \filename{foo}, containing +the changes made by the five successful hunks. + +\subsection{Some quirks of patch representation} + +There are a few useful things to know about how \command{patch} works +with files. +\begin{itemize} +\item This should already be obvious, but \command{patch} cannot + handle binary files. +\item Neither does it care about the executable bit; it creates new + files as readable, but not executable. +\item \command{patch} treats the removal of a file as a diff between + the file to be removed and the empty file. So your idea of ``I + deleted this file'' looks like ``every line of this file was + deleted'' in a patch. +\item It treats the addition of a file as a diff between the empty + file and the file to be added. So in a patch, your idea of ``I + added this file'' looks like ``every line of this file was added''. +\item It treats a renamed file as the removal of the old name, and the + addition of the new name. This means that renamed files have a big + footprint in patches. (Note also that Mercurial does not currently + try to infer when files have been renamed or copied in a patch.) +\item \command{patch} cannot represent empty files, so you cannot use + a patch to represent the notion ``I added this empty file to the + tree''. +\end{itemize} +\subsection{Beware the fuzz} + +While applying a hunk at an offset, or with a fuzz factor, will often +be completely successful, these inexact techniques naturally leave +open the possibility of corrupting the patched file. The most common +cases typically involve applying a patch twice, or at an incorrect +location in the file. If \command{patch} or \hgxcmd{mq}{qpush} ever +mentions an offset or fuzz factor, you should make sure that the +modified files are correct afterwards.
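To make the notions of offset and fuzz concrete, here is a simplified Python sketch of the fallback search described in the section on strategies for applying a patch: first an exact match at the expected line, then a match elsewhere in the file (an offset), then the same search with context trimmed from both ends (fuzz). The function name locate\_hunk and its parameters are hypothetical; the real \command{patch} program differs in many details, so treat this as a model of the idea rather than of its implementation.

```python
def locate_hunk(file_lines, old_lines, expected, context=3, max_fuzz=2):
    """Find where a hunk's "before" text matches in a file.

    file_lines: the current contents of the file, one string per line
    old_lines:  the hunk's context lines plus the lines it removes
    expected:   0-based index where the hunk header says old_lines start
    context:    number of context lines at each end of old_lines

    Returns (start, offset, fuzz), where start is the index at which the
    (possibly trimmed) text matched, or None if the hunk is rejected.
    """
    for fuzz in range(0, min(max_fuzz, context) + 1):
        # Drop `fuzz` lines of context from each end of the hunk.
        trimmed = old_lines[fuzz:len(old_lines) - fuzz]
        if not trimmed:
            break
        target = expected + fuzz  # trimming shifts the expected start
        # Try the expected position first, then positions ever further away.
        candidates = sorted(
            range(len(file_lines) - len(trimmed) + 1),
            key=lambda i: abs(i - target),
        )
        for start in candidates:
            if file_lines[start:start + len(trimmed)] == trimmed:
                return start, start - target, fuzz
    return None
```

A result with a nonzero offset or fuzz means the match was inexact, which is exactly the situation in which the patched file deserves a second look.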
+ +It's often a good idea to refresh a patch that has applied with an +offset or fuzz factor; refreshing the patch generates new context +information that will make it apply cleanly. I say ``often,'' not +``always,'' because sometimes refreshing a patch will make it fail to +apply against a different revision of the underlying files. In some +cases, such as when you're maintaining a patch that must sit on top of +multiple versions of a source tree, it's acceptable to have a patch +apply with some fuzz, provided you've verified the results of the +patching process in such cases. + +\subsection{Handling rejection} + +If \hgxcmd{mq}{qpush} fails to apply a patch, it will print an error +message and exit. If it has left \sfilename{.rej} files behind, it is +usually best to fix up the rejected hunks before you push more patches +or do any further work. + +If your patch \emph{used to} apply cleanly, and no longer does because +you've changed the underlying code that your patches are based on, +Mercurial Queues can help; see section~\ref{sec:mq:merge} for details. + +Unfortunately, there aren't any great techniques for dealing with +rejected hunks. Most often, you'll need to view the \sfilename{.rej} +file and edit the target file, applying the rejected hunks by hand. + +If you're feeling adventurous, Neil Brown, a Linux kernel hacker, +wrote a tool called \command{wiggle}~\cite{web:wiggle}, which is more +vigorous than \command{patch} in its attempts to make a patch apply. + +Another Linux kernel hacker, Chris Mason (the author of Mercurial +Queues), wrote a similar tool called +\command{mpatch}~\cite{web:mpatch}, which takes a simple approach to +automating the application of hunks rejected by \command{patch}. The +\command{mpatch} command can help with four common reasons that a hunk +may be rejected: + +\begin{itemize} +\item The context in the middle of a hunk has changed. +\item A hunk is missing some context at the beginning or end. 
+\item A large hunk might apply better---either entirely or in + part---if it was broken up into smaller hunks. +\item A hunk removes lines with slightly different content than those + currently present in the file. +\end{itemize} + +If you use \command{wiggle} or \command{mpatch}, you should be doubly +careful to check your results when you're done. In fact, +\command{mpatch} enforces this method of double-checking the tool's +output, by automatically dropping you into a merge program when it has +done its job, so that you can verify its work and finish off any +remaining merges. + +\section{Getting the best performance out of MQ} +\label{sec:mq:perf} + +MQ is very efficient at handling a large number of patches. I ran +some performance experiments in mid-2006 for a talk that I gave at the +2006 EuroPython conference~\cite{web:europython}. I used as my data +set the Linux 2.6.17-mm1 patch series, which consists of 1,738 +patches. I applied these on top of a Linux kernel repository +containing all 27,472 revisions between Linux 2.6.12-rc2 and Linux +2.6.17. + +On my old, slow laptop, I was able to +\hgcmdargs{qpush}{\hgxopt{mq}{qpush}{-a}} all 1,738 patches in 3.5 minutes, +and \hgcmdargs{qpop}{\hgxopt{mq}{qpop}{-a}} them all in 30 seconds. (On a +newer laptop, the time to push all patches dropped to two minutes.) I +could \hgxcmd{mq}{qrefresh} one of the biggest patches (which made 22,779 +lines of changes to 287 files) in 6.6 seconds. + +Clearly, MQ is well suited to working in large trees, but there are a +few tricks you can use to get the best performance out of it. + +First of all, try to ``batch'' operations together. Every time you +run \hgxcmd{mq}{qpush} or \hgxcmd{mq}{qpop}, these commands scan the working +directory once to make sure you haven't made some changes and then +forgotten to run \hgxcmd{mq}{qrefresh}. On a small tree, the time that +this scan takes is unnoticeable.
However, on a medium-sized tree +(containing tens of thousands of files), it can take a second or more. + +The \hgxcmd{mq}{qpush} and \hgxcmd{mq}{qpop} commands allow you to push and pop +multiple patches at a time. You can identify the ``destination +patch'' that you want to end up at. When you \hgxcmd{mq}{qpush} with a +destination specified, it will push patches until that patch is at the +top of the applied stack. When you \hgxcmd{mq}{qpop} to a destination, MQ +will pop patches until the destination patch is at the top. + +You can identify a destination patch using either its name or its +number. If you use numeric addressing, patches are +counted from zero; this means that the first patch is zero, the second +is one, and so on. + +\section{Updating your patches when the underlying code changes} +\label{sec:mq:merge} + +It's common to have a stack of patches on top of an underlying +repository that you don't modify directly. If you're working on +changes to third-party code, or on a feature that is taking longer to +develop than the rate of change of the code beneath, you will often +need to sync up with the underlying code, and fix up any hunks in your +patches that no longer apply. This is called \emph{rebasing} your +patch series. + +The simplest way to do this is to \hgcmdargs{qpop}{\hgxopt{mq}{qpop}{-a}} +your patches, then \hgcmd{pull} changes into the underlying +repository, and finally \hgcmdargs{qpush}{\hgxopt{mq}{qpush}{-a}} your +patches again. MQ will stop pushing any time it runs across a patch +that fails to apply because of conflicts, allowing you to fix your +conflicts, \hgxcmd{mq}{qrefresh} the affected patch, and continue pushing +until you have fixed your entire stack. + +This approach is easy to use and works well if you don't expect +changes to the underlying code to affect how well your patches apply.
+If your patch stack touches code that is modified frequently or +invasively in the underlying repository, however, fixing up rejected +hunks by hand quickly becomes tiresome. + +It's possible to partially automate the rebasing process. If your +patches apply cleanly against some revision of the underlying repo, MQ +can use this information to help you to resolve conflicts between your +patches and a different revision. + +The process is a little involved. +\begin{enumerate} +\item To begin, \hgcmdargs{qpush}{-a} all of your patches on top of + the revision where you know that they apply cleanly. +\item Save a backup copy of your patch directory using + \hgcmdargs{qsave}{\hgxopt{mq}{qsave}{-e} \hgxopt{mq}{qsave}{-c}}. This prints + the name of the directory that it has saved the patches in. It will + save the patches to a directory called + \sdirname{.hg/patches.\emph{N}}, where \texttt{\emph{N}} is a small + integer. It also commits a ``save changeset'' on top of your + applied patches; this is for internal book-keeping, and records the + states of the \sfilename{series} and \sfilename{status} files. +\item Use \hgcmd{pull} to bring new changes into the underlying + repository. (Don't run \hgcmdargs{pull}{-u}; see below for why.) +\item Update to the new tip revision, using + \hgcmdargs{update}{\hgopt{update}{-C}} to override the patches you + have pushed. +\item Merge all patches using \hgcmdargs{qpush}{\hgxopt{mq}{qpush}{-m} + \hgxopt{mq}{qpush}{-a}}. The \hgxopt{mq}{qpush}{-m} option to \hgxcmd{mq}{qpush} + tells MQ to perform a three-way merge if the patch fails to apply. +\end{enumerate} + +During the \hgcmdargs{qpush}{\hgxopt{mq}{qpush}{-m}}, each patch in the +\sfilename{series} file is applied normally. If a patch applies with +fuzz or rejects, MQ looks at the queue you \hgxcmd{mq}{qsave}d, and +performs a three-way merge with the corresponding changeset. 
This +merge uses Mercurial's normal merge machinery, so it may pop up a GUI +merge tool to help you to resolve problems. + +When you finish resolving the effects of a patch, MQ refreshes your +patch based on the result of the merge. + +At the end of this process, your repository will have one extra head +from the old patch queue, and a copy of the old patch queue will be in +\sdirname{.hg/patches.\emph{N}}. You can remove the extra head using +\hgcmdargs{qpop}{\hgxopt{mq}{qpop}{-a} \hgxopt{mq}{qpop}{-n} patches.\emph{N}} +or \hgcmd{strip}. You can delete \sdirname{.hg/patches.\emph{N}} once +you are sure that you no longer need it as a backup. + +\section{Identifying patches} + +MQ commands that work with patches let you refer to a patch either by +using its name or by a number. By name is obvious enough; pass the +name \filename{foo.patch} to \hgxcmd{mq}{qpush}, for example, and it will +push patches until \filename{foo.patch} is applied. + +As a shortcut, you can refer to a patch using both a name and a +numeric offset; \texttt{foo.patch-2} means ``two patches before +\texttt{foo.patch}'', while \texttt{bar.patch+4} means ``four patches +after \texttt{bar.patch}''. + +Referring to a patch by index isn't much different. The first patch +printed in the output of \hgxcmd{mq}{qseries} is patch zero (yes, it's one +of those start-at-zero counting systems); the second is patch one; and +so on. + +MQ also makes it easy to work with patches when you are using normal +Mercurial commands. Every command that accepts a changeset ID will +also accept the name of an applied patch. MQ augments the tags +normally in the repository with an eponymous one for each applied +patch. In addition, the special tags \index{tags!special tag + names!\texttt{qbase}}\texttt{qbase} and \index{tags!special tag + names!\texttt{qtip}}\texttt{qtip} identify the ``bottom-most'' and +topmost applied patches, respectively. 
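The rules for naming a patch given earlier in this section (by name, by zero-based index, or by a name plus a numeric offset) can be sketched in Python as follows. The function resolve\_patch is hypothetical and only models the rules as stated here; the order in which it tries the three forms is an assumption, and MQ's own lookup code may behave differently in corner cases.

```python
import re


def resolve_patch(series, ref):
    """Return the index in `series` of the patch that `ref` names.

    series: patch names in the order printed by qseries
    ref:    a patch name, a zero-based index, or NAME-N / NAME+N
    """
    if ref in series:
        # An exact patch name wins over any other interpretation.
        return series.index(ref)
    if ref.isdigit():
        index = int(ref)
    else:
        # NAME-N means N patches before NAME; NAME+N means N after.
        match = re.fullmatch(r"(.+)([+-])(\d+)", ref)
        if match is None or match.group(1) not in series:
            raise LookupError("unknown patch: " + ref)
        base = series.index(match.group(1))
        delta = int(match.group(3))
        index = base - delta if match.group(2) == "-" else base + delta
    if not 0 <= index < len(series):
        raise LookupError("patch reference out of range: " + ref)
    return index
```

Checking for an exact name first keeps a patch whose own name happens to end in something like \texttt{-2} reachable by that name.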
+ +These additions to Mercurial's normal tagging capabilities make +dealing with patches even more of a breeze. +\begin{itemize} +\item Want to patchbomb a mailing list with your latest series of + changes? + \begin{codesample4} + hg email qbase:qtip + \end{codesample4} + (Don't know what ``patchbombing'' is? See + section~\ref{sec:hgext:patchbomb}.) +\item Need to see all of the patches since \texttt{foo.patch} that + have touched files in a subdirectory of your tree? + \begin{codesample4} + hg log -r foo.patch:qtip \emph{subdir} + \end{codesample4} +\end{itemize} + +Because MQ makes the names of patches available to the rest of +Mercurial through its normal internal tag machinery, you don't need to +type in the entire name of a patch when you want to identify it by +name. + +\begin{figure}[ht] + \interaction{mq.id.output} + \caption{Using MQ's tag features to work with patches} + \label{ex:mq:id} +\end{figure} + +Another nice consequence of representing patch names as tags is that +when you run the \hgcmd{log} command, it will display a patch's name +as a tag, simply as part of its normal output. This makes it easy to +visually distinguish applied patches from underlying ``normal'' +revisions. Figure~\ref{ex:mq:id} shows a few normal Mercurial +commands in use with applied patches. + +\section{Useful things to know about} + +There are a number of aspects of MQ usage that don't fit tidily into +sections of their own, but that are good to know. Here they are, in +one place. + +\begin{itemize} +\item Normally, when you \hgxcmd{mq}{qpop} a patch and \hgxcmd{mq}{qpush} it + again, the changeset that represents the patch after the pop/push + will have a \emph{different identity} than the changeset that + represented the patch beforehand. See + section~\ref{sec:mqref:cmd:qpush} for information as to why this is.
+\item It's not a good idea to \hgcmd{merge} changes from another + branch with a patch changeset, at least if you want to maintain the + ``patchiness'' of that changeset and changesets below it on the + patch stack. If you try to do this, it will appear to succeed, but + MQ will become confused. +\end{itemize} + +\section{Managing patches in a repository} +\label{sec:mq:repo} + +Because MQ's \sdirname{.hg/patches} directory resides outside a +Mercurial repository's working directory, the ``underlying'' Mercurial +repository knows nothing about the management or presence of patches. + +This presents the interesting possibility of managing the contents of +the patch directory as a Mercurial repository in its own right. This +can be a useful way to work. For example, you can work on a patch for +a while, \hgxcmd{mq}{qrefresh} it, then \hgcmd{commit} the current state of +the patch. This lets you ``roll back'' to that version of the patch +later on. + +You can then share different versions of the same patch stack among +multiple underlying repositories. I use this when I am developing a +Linux kernel feature. I have a pristine copy of my kernel sources for +each of several CPU architectures, and a cloned repository under each +that contains the patches I am working on. When I want to test a +change on a different architecture, I push my current patches to the +patch repository associated with that kernel tree, pop and push all of +my patches, and build and test that kernel. + +Managing patches in a repository makes it possible for multiple +developers to work on the same patch series without colliding with +each other, all on top of an underlying source base that they may or +may not control. 
+ +\subsection{MQ support for patch repositories} + +MQ helps you to work with the \sdirname{.hg/patches} directory as a +repository; when you prepare a repository for working with patches +using \hgxcmd{mq}{qinit}, you can pass the \hgxopt{mq}{qinit}{-c} option to +create the \sdirname{.hg/patches} directory as a Mercurial repository. + +\begin{note} + If you forget to use the \hgxopt{mq}{qinit}{-c} option, you can simply go + into the \sdirname{.hg/patches} directory at any time and run + \hgcmd{init}. Don't forget to add an entry for the + \sfilename{status} file to the \sfilename{.hgignore} file, though + + (\hgcmdargs{qinit}{\hgxopt{mq}{qinit}{-c}} does this for you + automatically); you \emph{really} don't want to manage the + \sfilename{status} file. +\end{note} + +As a convenience, if MQ notices that the \dirname{.hg/patches} +directory is a repository, it will automatically \hgcmd{add} every +patch that you create and import. + +MQ provides a shortcut command, \hgxcmd{mq}{qcommit}, that runs +\hgcmd{commit} in the \sdirname{.hg/patches} directory. This saves +some bothersome typing. + +Finally, as a convenience to manage the patch directory, you can +define the alias \command{mq} on Unix systems. For example, on Linux +systems using the \command{bash} shell, you can include the following +snippet in your \tildefile{.bashrc}. + +\begin{codesample2} + alias mq=`hg -R \$(hg root)/.hg/patches' +\end{codesample2} + +You can then issue commands of the form \cmdargs{mq}{pull} from +the main repository. + +\subsection{A few things to watch out for} + +MQ's support for working with a repository full of patches is limited +in a few small respects. + +MQ cannot automatically detect changes that you make to the patch +directory. 
If you \hgcmd{pull}, manually edit, or \hgcmd{update} +changes to patches or the \sfilename{series} file, you will have to +\hgcmdargs{qpop}{\hgxopt{mq}{qpop}{-a}} and then +\hgcmdargs{qpush}{\hgxopt{mq}{qpush}{-a}} in the underlying repository to +see those changes show up there. If you forget to do this, you can +confuse MQ's idea of which patches are applied. + +\section{Third party tools for working with patches} +\label{sec:mq:tools} + +Once you've been working with patches for a while, you'll find +yourself hungry for tools that will help you to understand and +manipulate the patches you're dealing with. + +The \command{diffstat} command~\cite{web:diffstat} generates a +histogram of the modifications made to each file in a patch. It +provides a good way to ``get a sense of'' a patch---which files it +affects, and how much change it introduces to each file and as a +whole. (I find that it's a good idea to use \command{diffstat}'s +\cmdopt{diffstat}{-p} option as a matter of course, as otherwise it +will try to do clever things with prefixes of file names that +inevitably confuse at least me.) + +\begin{figure}[ht] + \interaction{mq.tools.tools} + \caption{The \command{diffstat}, \command{filterdiff}, and \command{lsdiff} commands} + \label{ex:mq:tools} +\end{figure} + +The \package{patchutils} package~\cite{web:patchutils} is invaluable. +It provides a set of small utilities that follow the ``Unix +philosophy;'' each does one useful thing with a patch. The +\package{patchutils} command I use most is \command{filterdiff}, which +extracts subsets from a patch file. For example, given a patch that +modifies hundreds of files across dozens of directories, a single +invocation of \command{filterdiff} can generate a smaller patch that +only touches files whose names match a particular glob pattern. See +section~\ref{mq-collab:tips:interdiff} for another example. 
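To make the idea of extracting a subset of a patch concrete, here is a rough Python sketch that keeps only the per-file sections of a unified diff whose target file name matches a glob pattern. It is not how \command{filterdiff} is implemented, and it only understands plain \texttt{---}/\texttt{+++} file headers; the name \texttt{filter\_patch} and the sample patch are hypothetical.

```python
from fnmatch import fnmatchcase


def filter_patch(patch_text, pattern):
    """Return the parts of a unified diff whose target file matches `pattern`."""
    lines = patch_text.splitlines(keepends=True)
    sections = []   # list of (target file name, lines of that section)
    current = None
    for i, line in enumerate(lines):
        # A "--- " line directly followed by a "+++ " line starts a new file.
        # (Simplification: a hunk that removes a line starting with "-- "
        # just before adding one starting with "++ " would confuse this test.)
        starts_file = (
            line.startswith("--- ")
            and i + 1 < len(lines)
            and lines[i + 1].startswith("+++ ")
        )
        if starts_file:
            # The target name follows "+++ "; drop any tab-separated timestamp.
            target = lines[i + 1][4:].split("\t")[0].strip()
            current = (target, [])
            sections.append(current)
        if current is not None:
            current[1].append(line)
    return "".join(
        "".join(body) for name, body in sections if fnmatchcase(name, pattern)
    )


sample = (
    "--- a/src/main.c\n"
    "+++ b/src/main.c\n"
    "@@ -1 +1 @@\n"
    "-old\n"
    "+new\n"
    "--- a/README\n"
    "+++ b/README\n"
    "@@ -1 +1 @@\n"
    "-hi\n"
    "+hello\n"
)
print(filter_patch(sample, "*.c"), end="")
```

Anything that precedes the first file header, such as a patch description, is dropped by this sketch. The real tool offers far more control, so treat this only as a picture of the idea.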
+ +\section{Good ways to work with patches} + +Whether you are working on a patch series to submit to a free software +or open source project, or a series that you intend to treat as a +sequence of regular changesets when you're done, you can use some +simple techniques to keep your work well organised. + +Give your patches descriptive names. A good name for a patch might be +\filename{rework-device-alloc.patch}, because it will immediately give +you a hint as to what the purpose of the patch is. Long names shouldn't be +a problem; you won't be typing the names often, but you \emph{will} be +running commands like \hgxcmd{mq}{qapplied} and \hgxcmd{mq}{qtop} over and over. +Good naming becomes especially important when you have a number of +patches to work with, or if you are juggling a number of different +tasks and your patches only get a fraction of your attention. + +Be aware of what patch you're working on. Use the \hgxcmd{mq}{qtop} +command and skim over the text of your patches frequently (for +example, using \hgcmdargs{tip}{\hgopt{tip}{-p}}) to be sure of where +you stand. I have several times worked on and \hgxcmd{mq}{qrefresh}ed a +patch other than the one I intended, and it's often tricky to migrate +changes into the right patch after making them in the wrong one. + +For this reason, it is very much worth investing a little time to +learn how to use some of the third-party tools I described in +section~\ref{sec:mq:tools}, particularly \command{diffstat} and +\command{filterdiff}. The former will give you a quick idea of what +changes your patch is making, while the latter makes it easy to splice +hunks selectively out of one patch and into another. + +\section{MQ cookbook} + +\subsection{Manage ``trivial'' patches} + +Because the overhead of dropping files into a new Mercurial repository +is so low, it makes a lot of sense to manage patches this way even if +you simply want to make a few changes to a source tarball that you +downloaded.
+ +Begin by downloading and unpacking the source tarball, +and turning it into a Mercurial repository. +\interaction{mq.tarball.download} + +Continue by creating a patch stack and making your changes. +\interaction{mq.tarball.qinit} + +Let's say a few weeks or months pass, and your package author releases +a new version. First, bring their changes into the repository. +\interaction{mq.tarball.newsource} +The pipeline starting with \hgcmd{locate} above deletes all files in +the working directory, so that \hgcmd{commit}'s +\hgopt{commit}{--addremove} option can actually tell which files have +really been removed in the newer version of the source. + +Finally, you can apply your patches on top of the new tree. +\interaction{mq.tarball.repush} + +\subsection{Combining entire patches} +\label{sec:mq:combine} + +MQ provides a command, \hgxcmd{mq}{qfold}, that lets you combine entire +patches. This ``folds'' the patches you name, in the order you name +them, into the topmost applied patch, and concatenates their +descriptions onto the end of its description. The patches that you +fold must be unapplied before you fold them. + +The order in which you fold patches matters. If your topmost applied +patch is \texttt{foo}, and you \hgxcmd{mq}{qfold} \texttt{bar} and +\texttt{quux} into it, you will end up with a patch that has the same +effect as if you applied first \texttt{foo}, then \texttt{bar}, +followed by \texttt{quux}. + +\subsection{Merging part of one patch into another} + +Merging \emph{part} of one patch into another is more difficult than +combining entire patches. + +If you want to move changes to entire files, you can use +\command{filterdiff}'s \cmdopt{filterdiff}{-i} and +\cmdopt{filterdiff}{-x} options to choose the modifications to snip +out of one patch, concatenating its output onto the end of the patch +you want to merge into. You usually won't need to modify the patch +you've merged the changes from.
Instead, MQ will report some rejected +hunks when you \hgxcmd{mq}{qpush} it (from the hunks you moved into the +other patch), and you can simply \hgxcmd{mq}{qrefresh} the patch to drop +the duplicate hunks. + +If you have a patch that has multiple hunks modifying a file, and you +only want to move a few of those hunks, the job becomes messier, +but you can still partly automate it. Use \cmdargs{lsdiff}{-nvv} to +print some metadata about the patch. +\interaction{mq.tools.lsdiff} + +This command prints three different kinds of number: +\begin{itemize} +\item (in the first column) a \emph{file number} to identify each file + modified in the patch; +\item (on the next line, indented) the line number within a modified + file where a hunk starts; and +\item (on the same line) a \emph{hunk number} to identify that hunk. +\end{itemize} + +You'll have to use some visual inspection, and reading of the patch, +to identify the file and hunk numbers you'll want, but you can then +pass them to \command{filterdiff}'s \cmdopt{filterdiff}{--files} +and \cmdopt{filterdiff}{--hunks} options, to select exactly the file +and hunk you want to extract. + +Once you have this hunk, you can concatenate it onto the end of your +destination patch and continue with the remainder of +section~\ref{sec:mq:combine}. + +\section{Differences between quilt and MQ} + +If you are already familiar with quilt, MQ provides a similar command +set. There are a few differences in the way that it works. + +You will already have noticed that most quilt commands have MQ +counterparts that simply begin with a ``\texttt{q}''. The exceptions +are quilt's \texttt{add} and \texttt{remove} commands, the +counterparts for which are the normal Mercurial \hgcmd{add} and +\hgcmd{remove} commands. There is no MQ equivalent of the quilt +\texttt{edit} command.
+ +%%% Local Variables: +%%% mode: latex +%%% TeX-master: "00book" +%%% End: diff -r 7f0af73f53ab -r 7e52f0cc4516 es/note.png Binary file es/note.png has changed diff -r 7f0af73f53ab -r 7e52f0cc4516 es/tour-merge.tex --- a/es/tour-merge.tex Sat Oct 18 14:35:43 2008 -0500 +++ b/es/tour-merge.tex Sat Oct 18 15:44:41 2008 -0500 @@ -0,0 +1,283 @@ +\chapter{A tour of Mercurial: merging work} +\label{chap:tour-merge} + +We've now covered cloning a repository, making changes in a +repository, and pulling or pushing changes from one repository into +another. Our next step is \emph{merging} changes from separate +repositories. + +\section{Merging streams of work} + +Merging is a fundamental part of working with a distributed revision +control tool. +\begin{itemize} +\item Alice and Bob each have a personal copy of a repository for a + project they're collaborating on. Alice fixes a bug in her + repository; Bob adds a new feature in his. They want the shared + repository to contain both the bug fix and the new feature. +\item I frequently work on several different tasks for a single + project at once, each safely isolated in its own repository. + Working this way means that I often need to merge one piece of my + own work with another. +\end{itemize} + +Because merging is such a common thing to need to do, Mercurial makes +it easy. Let's walk through the process. We'll begin by cloning yet +another repository (see how often they spring up?) and making a change +in it. +\interaction{tour.merge.clone} +We should now have two copies of \filename{hello.c} with different +contents. The histories of the two repositories have also diverged, +as illustrated in figure~\ref{fig:tour-merge:sep-repos}. 
+\interaction{tour.merge.cat} + +\begin{figure}[ht] + \centering + \grafix{tour-merge-sep-repos} + \caption{Divergent recent histories of the \dirname{my-hello} and + \dirname{my-new-hello} repositories} + \label{fig:tour-merge:sep-repos} +\end{figure} + +We already know that pulling changes from our \dirname{my-hello} +repository will have no effect on the working directory. +\interaction{tour.merge.pull} +However, the \hgcmd{pull} command says something about ``heads''. + +\subsection{Head changesets} + +A head is a change that has no descendants, or children, as they're +also known. The tip revision is thus a head, because the newest +revision in a repository doesn't have any children, but a repository +can contain more than one head. + +\begin{figure}[ht] + \centering + \grafix{tour-merge-pull} + \caption{Repository contents after pulling from \dirname{my-hello} into + \dirname{my-new-hello}} + \label{fig:tour-merge:pull} +\end{figure} + +In figure~\ref{fig:tour-merge:pull}, you can see the effect of the +pull from \dirname{my-hello} into \dirname{my-new-hello}. The history +that was already present in \dirname{my-new-hello} is untouched, but a +new revision has been added. By referring to +figure~\ref{fig:tour-merge:sep-repos}, we can see that the +\emph{changeset ID} remains the same in the new repository, but the +\emph{revision number} has changed. (This, incidentally, is a fine +example of why it's not safe to use revision numbers when discussing +changesets.) We can view the heads in a repository using the +\hgcmd{heads} command. +\interaction{tour.merge.heads} + +\subsection{Performing the merge} + +What happens if we try to use the normal \hgcmd{update} command to +update to the new tip? +\interaction{tour.merge.update} +Mercurial is telling us that the \hgcmd{update} command won't do a +merge; it won't update the working directory when it thinks we might +be wanting to do a merge, unless we force it to do so. 
Instead, we +use the \hgcmd{merge} command to merge the two heads. +\interaction{tour.merge.merge} + +\begin{figure}[ht] + \centering + \grafix{tour-merge-merge} + \caption{Working directory and repository during merge, and + following commit} + \label{fig:tour-merge:merge} +\end{figure} + +This updates the working directory so that it contains changes from +\emph{both} heads, which is reflected in both the output of +\hgcmd{parents} and the contents of \filename{hello.c}. +\interaction{tour.merge.parents} + +\subsection{Committing the results of the merge} + +Whenever we've done a merge, \hgcmd{parents} will display two parents +until we \hgcmd{commit} the results of the merge. +\interaction{tour.merge.commit} +We now have a new tip revision; notice that it has \emph{both} of +our former heads as its parents. These are the same revisions that +were previously displayed by \hgcmd{parents}. +\interaction{tour.merge.tip} +In figure~\ref{fig:tour-merge:merge}, you can see a representation of +what happens to the working directory during the merge, and how this +affects the repository when the commit happens. During the merge, the +working directory has two parent changesets, and these become the +parents of the new changeset. + +\section{Merging conflicting changes} + +Most merges are simple affairs, but sometimes you'll find yourself +merging changes where each modifies the same portions of the same +files. Unless both modifications are identical, this results in a +\emph{conflict}, where you have to decide how to reconcile the +different changes into something coherent. + +\begin{figure}[ht] + \centering + \grafix{tour-merge-conflict} + \caption{Conflicting changes to a document} + \label{fig:tour-merge:conflict} +\end{figure} + +Figure~\ref{fig:tour-merge:conflict} illustrates an instance of two +conflicting changes to a document. We started with a single version +of the file; then we made some changes, while someone else made +different changes to the same text.
Our task in resolving the +conflicting changes is to decide what the file should look like. + +Mercurial doesn't have a built-in facility for handling conflicts. +Instead, it runs an external program called \command{hgmerge}. This +is a shell script that is bundled with Mercurial; you can change it to +behave however you please. What it does by default is try to find one +of several different merging tools that are likely to be installed on +your system. It first tries a few fully automatic merging tools; if +these don't succeed (because the resolution process requires human +guidance) or aren't present, the script tries a few different +graphical merging tools. + +It's also possible to get Mercurial to run another program or script +instead of \command{hgmerge}, by setting the \envar{HGMERGE} +environment variable to the name of your preferred program. + +\subsection{Using a graphical merge tool} + +My preferred graphical merge tool is \command{kdiff3}, which I'll use +to describe the features that are common to graphical file merging +tools. You can see a screenshot of \command{kdiff3} in action in +figure~\ref{fig:tour-merge:kdiff3}. The kind of merge it is +performing is called a \emph{three-way merge}, because there are three +different versions of the file of interest to us. The tool thus +splits the upper portion of the window into three panes: +\begin{itemize} +\item At the left is the \emph{base} version of the file, i.e.~the + most recent version from which the two versions we're trying to + merge are descended. +\item In the middle is ``our'' version of the file, with the contents + that we modified. +\item On the right is ``their'' version of the file, the one + from the changeset that we're trying to merge with. +\end{itemize} +In the pane below these is the current \emph{result} of the merge.
+Our task is to replace all of the red text, which indicates unresolved +conflicts, with some sensible merger of the ``ours'' and ``theirs'' +versions of the file. + +All four of these panes are \emph{locked together}; if we scroll +vertically or horizontally in any of them, the others are updated to +display the corresponding sections of their respective files. + +\begin{figure}[ht] + \centering + \grafix{kdiff3} + \caption{Using \command{kdiff3} to merge versions of a file} + \label{fig:tour-merge:kdiff3} +\end{figure} + +For each conflicting portion of the file, we can choose to resolve +the conflict using some combination of text from the base version, +ours, or theirs. We can also manually edit the merged file at any +time, in case we need to make further modifications. + +There are \emph{many} file merging tools available, too many to cover +here. They vary in which platforms they are available for, and in +their particular strengths and weaknesses. Most are tuned for merging +files containing plain text, while a few are aimed at specialised file +formats (generally XML). + +\subsection{A worked example} + +In this example, we will reproduce the file modification history of +figure~\ref{fig:tour-merge:conflict} above. Let's begin by creating a +repository with a base version of our document. +\interaction{tour-merge-conflict.wife} +We'll clone the repository and make a change to the file. +\interaction{tour-merge-conflict.cousin} +And another clone, to simulate someone else making a change to the +file. (This hints at the idea that it's not all that unusual to merge +with yourself when you isolate tasks in separate repositories, and +indeed to find and resolve conflicts while doing so.) +\interaction{tour-merge-conflict.son} +Having created two different versions of the file, we'll set up an +environment suitable for running our merge. 
+\interaction{tour-merge-conflict.pull} + +In this example, I won't use Mercurial's normal \command{hgmerge} +program to do the merge, because it would drop my nice automated +example-running tool into a graphical user interface. Instead, I'll +set \envar{HGMERGE} to tell Mercurial to use the non-interactive +\command{merge} command. This is bundled with many Unix-like systems. +If you're following this example on your computer, don't bother +setting \envar{HGMERGE}. +\interaction{tour-merge-conflict.merge} +Because \command{merge} can't resolve the conflicting changes, it +leaves \emph{merge markers} inside the file that has conflicts, +indicating which lines have conflicts, and whether they came from our +version of the file or theirs. + +Mercurial can tell from the way \command{merge} exits that it wasn't +able to merge successfully, so it tells us what commands we'll need to +run if we want to redo the merging operation. This could be useful +if, for example, we were running a graphical merge tool and quit +because we were confused or realised we had made a mistake. + +If automatic or manual merges fail, there's nothing to prevent us from +``fixing up'' the affected files ourselves, and committing the results +of our merge: +\interaction{tour-merge-conflict.commit} + +\section{Simplifying the pull-merge-commit sequence} +\label{sec:tour-merge:fetch} + +The process of merging changes as outlined above is straightforward, +but requires running three commands in sequence. +\begin{codesample2} + hg pull + hg merge + hg commit -m 'Merged remote changes' +\end{codesample2} +In the case of the final commit, you also need to enter a commit +message, which is almost always going to be a piece of uninteresting +``boilerplate'' text. + +It would be nice to reduce the number of steps needed, if this were +possible. Indeed, Mercurial is distributed with an extension called +\hgext{fetch} that does just this. 
+ +Mercurial provides a flexible extension mechanism that lets people +extend its functionality, while keeping the core of Mercurial small +and easy to deal with. Some extensions add new commands that you can +use from the command line, while others work ``behind the scenes,'' +for example adding capabilities to the server. + +The \hgext{fetch} extension adds a new command called, not +surprisingly, \hgcmd{fetch}. This extension acts as a combination of +\hgcmd{pull}, \hgcmd{update} and \hgcmd{merge}. It begins by pulling +changes from another repository into the current repository. If it +finds that the changes added a new head to the repository, it begins a +merge, then commits the result of the merge with an +automatically-generated commit message. If no new heads were added, +it updates the working directory to the new tip changeset. + +Enabling the \hgext{fetch} extension is easy. Edit your +\sfilename{.hgrc}, and either go to the \rcsection{extensions} section +or create an \rcsection{extensions} section. Then add a line that +simply reads ``\Verb+fetch +''. +\begin{codesample2} + [extensions] + fetch = +\end{codesample2} +(Normally, on the right-hand side of the ``\texttt{=}'' would appear +the location of the extension, but since the \hgext{fetch} extension +is in the standard distribution, Mercurial knows where to search for +it.) 
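The choice that \hgcmd{fetch} makes between merging and updating, as described above, comes down to whether the pull added a head. The following Python sketch models only that decision on a toy history; it is not the extension's code, and the names \texttt{heads} and \texttt{fetch\_steps}, as well as the dictionary representation of a history, are assumptions made for illustration.

```python
def heads(parents):
    """Return the changesets that have no children.

    `parents` maps each changeset to the list of its parents.
    """
    has_child = {p for ps in parents.values() for p in ps}
    return sorted(c for c in parents if c not in has_child)


def fetch_steps(before, after):
    """List the steps a fetch-like command would take.

    `before` and `after` describe the history before and after the pull.
    """
    steps = ["pull"]
    if len(heads(after)) > len(heads(before)):
        # The pull introduced a new head: merge it and commit the result.
        steps += ["merge", "commit"]
    else:
        # No new head: just move the working directory to the new tip.
        steps.append("update")
    return steps


before = {0: [], 1: [0]}
linear = {0: [], 1: [0], 2: [1]}     # pulled changeset extends the tip
diverged = {0: [], 1: [0], 2: [0]}   # pulled changeset forks from revision 0

print(fetch_steps(before, linear))
print(fetch_steps(before, diverged))
```

The same \texttt{heads} helper also mirrors the earlier definition of a head as a changeset with no children, which is why the diverged history reports two of them.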
+ +%%% Local Variables: +%%% mode: latex +%%% TeX-master: "00book" +%%% End: diff -r 7f0af73f53ab -r 7e52f0cc4516 es/undo-manual-merge.dot --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/es/undo-manual-merge.dot Sat Oct 18 15:44:41 2008 -0500 @@ -0,0 +1,8 @@ +digraph undo_manual { + "first change" -> "second change"; + "second change" -> "third change"; + backout [label="back out\nsecond change", shape=box]; + "second change" -> backout; + "third change" -> "manual\nmerge"; + backout -> "manual\nmerge"; +} diff -r 7f0af73f53ab -r 7e52f0cc4516 es/undo-non-tip.dot --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/es/undo-non-tip.dot Sat Oct 18 15:44:41 2008 -0500 @@ -0,0 +1,9 @@ +digraph undo_non_tip { + "first change" -> "second change"; + "second change" -> "third change"; + backout [label="back out\nsecond change", shape=box]; + "second change" -> backout; + merge [label="automated\nmerge", shape=box]; + "third change" -> merge; + backout -> merge; +}