hgbook: en/ch12-mq.tex comparison

comparison en/ch12-mq.tex @ 649:5cd47f721686

Rename LaTeX input files to have numeric prefixes

author	Bryan O'Sullivan <bos@serpentine.com>
date	Thu, 29 Jan 2009 22:56:27 -0800
parents	en/mq.tex@97e929385442
children

comparison

equal deleted inserted replaced

-:bc14f94e726a
+:5cd47f721686
+\chapter{Managing change with Mercurial Queues}
+\label{chap:mq}
+\section{The patch management problem}
+\label{sec:mq:patch-mgmt}
+Here is a common scenario: you need to install a software package from
+source, but you find a bug that you must fix in the source before you
+can start using the package.  You make your changes, forget about the
+package for a while, and a few months later you need to upgrade to a
+newer version of the package.  If the newer version of the package
+still has the bug, you must extract your fix from the older source
+tree and apply it against the newer version.  This is a tedious task,
+and it's easy to make mistakes.
+This is a simple case of the ``patch management'' problem.  You have
+an ``upstream'' source tree that you can't change; you need to make
+some local changes on top of the upstream tree; and you'd like to be
+able to keep those changes separate, so that you can apply them to
+newer versions of the upstream source.
+The patch management problem arises in many situations.  Probably the
+most visible is that a user of an open source software project will
+contribute a bug fix or new feature to the project's maintainers in the
+form of a patch.
+Distributors of operating systems that include open source software
+often need to make changes to the packages they distribute so that
+they will build properly in their environments.
+When you have few changes to maintain, it is easy to manage a single
+patch using the standard \command{diff} and \command{patch} programs
+(see section~\ref{sec:mq:patch} for a discussion of these tools).
+Once the number of changes grows, it starts to make sense to maintain
+patches as discrete ``chunks of work,'' so that for example a single
+patch will contain only one bug fix (the patch might modify several
+files, but it's doing ``only one thing''), and you may have a number
+of such patches for different bugs you need fixed and local changes
+you require.  In this situation, if you submit a bug fix patch to the
+upstream maintainers of a package and they include your fix in a
+subsequent release, you can simply drop that single patch when you're
+updating to the newer release.
+Maintaining a single patch against an upstream tree is a little
+tedious and error-prone, but not difficult.  However, the complexity
+of the problem grows rapidly as the number of patches you have to
+maintain increases.  With more than a tiny number of patches in hand,
+understanding which ones you have applied and maintaining them moves
+from messy to overwhelming.
+Fortunately, Mercurial includes a powerful extension, Mercurial Queues
+(or simply ``MQ''), that massively simplifies the patch management
+problem.
+\section{The prehistory of Mercurial Queues}
+\label{sec:mq:history}
+During the late 1990s, several Linux kernel developers started to
+maintain ``patch series'' that modified the behaviour of the Linux
+kernel.  Some of these series were focused on stability, some on
+feature coverage, and others were more speculative.
+The sizes of these patch series grew rapidly.  In 2002, Andrew Morton
+published some shell scripts he had been using to automate the task of
+managing his patch queues.  Andrew was successfully using these
+scripts to manage hundreds (sometimes thousands) of patches on top of
+the Linux kernel.
+\subsection{A patchwork quilt}
+\label{sec:mq:quilt}
+In early 2003, Andreas Gruenbacher and Martin Quinson borrowed the
+approach of Andrew's scripts and published a tool called ``patchwork
+quilt''~\cite{web:quilt}, or simply ``quilt''
+(see~\cite{gruenbacher:2005} for a paper describing it).  Because
+quilt substantially automated patch management, it rapidly gained a
+large following among open source software developers.
+Quilt manages a \emph{stack of patches} on top of a directory tree.
+To begin, you tell quilt to manage a directory tree, and tell it which
+files you want to manage; it stores away the names and contents of
+those files.  To fix a bug, you create a new patch (using a single
+command), edit the files you need to fix, then ``refresh'' the patch.
+The refresh step causes quilt to scan the directory tree; it updates
+the patch with all of the changes you have made.  You can create
+another patch on top of the first, which will track the changes
+required to modify the tree from ``tree with one patch applied'' to
+``tree with two patches applied''.
+You can \emph{change} which patches are applied to the tree.  If you
+``pop'' a patch, the changes made by that patch will vanish from the
+directory tree.  Quilt remembers which patches you have popped,
+though, so you can ``push'' a popped patch again, and the directory
+tree will be restored to contain the modifications in the patch.  Most
+importantly, you can run the ``refresh'' command at any time, and the
+topmost applied patch will be updated.  This means that you can, at
+any time, change both which patches are applied and what
+modifications those patches make.
+Quilt knows nothing about revision control tools, so it works equally
+well on top of an unpacked tarball or a Subversion working copy.
+\subsection{From patchwork quilt to Mercurial Queues}
+\label{sec:mq:quilt-mq}
+In mid-2005, Chris Mason took the features of quilt and wrote an
+extension that he called Mercurial Queues, which added quilt-like
+behaviour to Mercurial.
+The key difference between quilt and MQ is that quilt knows nothing
+about revision control systems, while MQ is \emph{integrated} into
+Mercurial.  Each patch that you push is represented as a Mercurial
+changeset.  Pop a patch, and the changeset goes away.
+Because quilt does not care about revision control tools, it is still
+a tremendously useful piece of software to know about for situations
+where you cannot use Mercurial and MQ.
+\section{The huge advantage of MQ}
+I cannot overstate the value that MQ offers through the unification of
+patches and revision control.
+A major reason that patches have persisted in the free software and
+open source world---in spite of the availability of increasingly
+capable revision control tools over the years---is the \emph{agility}
+they offer.
+Traditional revision control tools make a permanent, irreversible
+record of everything that you do.  While this has great value, it's
+also somewhat stifling.  If you want to perform a wild-eyed
+experiment, you have to be careful in how you go about it, or you risk
+leaving unneeded---or worse, misleading or destabilising---traces of
+your missteps and errors in the permanent revision record.
+By contrast, MQ's marriage of distributed revision control with
+patches makes it much easier to isolate your work.  Your patches live
+on top of normal revision history, and you can make them disappear or
+reappear at will.  If you don't like a patch, you can drop it.  If a
+patch isn't quite as you want it to be, simply fix it---as many times
+as you need to, until you have refined it into the form you desire.
+As an example, the integration of patches with revision control makes
+understanding patches and debugging their effects---and their
+interplay with the code they're based on---\emph{enormously} easier.
+Since every applied patch has an associated changeset, you can use
+\hgcmdargs{log}{\emph{filename}} to see which changesets and patches
+affected a file.  You can use the \hgext{bisect} command to
+binary-search through all changesets and applied patches to see where
+a bug got introduced or fixed.  You can use the \hgcmd{annotate}
+command to see which changeset or patch modified a particular line of
+a source file.  And so on.
+\section{Understanding patches}
+\label{sec:mq:patch}
+Because MQ doesn't hide its patch-oriented nature, it is helpful to
+understand what patches are, and a little about the tools that work
+with them.
+The traditional Unix \command{diff} command compares two files, and
+prints a list of differences between them. The \command{patch} command
+understands these differences as \emph{modifications} to make to a
+file.  Take a look at figure~\ref{ex:mq:diff} for a simple example of
+these commands in action.
+\begin{figure}[ht]
+\interaction{mq.dodiff.diff}
+\caption{Simple uses of the \command{diff} and \command{patch} commands}
+\label{ex:mq:diff}
+\end{figure}
+The type of file that \command{diff} generates (and \command{patch}
+takes as input) is called a ``patch'' or a ``diff''; there is no
+difference between a patch and a diff.  (We'll use the term ``patch'',
+since it's more commonly used.)
+A patch file can start with arbitrary text; the \command{patch}
+command ignores this text, but MQ uses it as the commit message when
+creating changesets.  To find the beginning of the patch content,
+\command{patch} searches for the first line that starts with the
+string ``\texttt{diff~-}''.
+MQ works with \emph{unified} diffs (\command{patch} can accept several
+other diff formats, but MQ doesn't).  A unified diff contains two
+kinds of header.  The \emph{file header} describes the file being
+modified; it contains the name of the file to modify.  When
+\command{patch} sees a new file header, it looks for a file with that
+name to start modifying.
+After the file header comes a series of \emph{hunks}.  Each hunk
+starts with a header; this identifies the range of line numbers within
+the file that the hunk should modify.  Following the header, a hunk
+starts and ends with a few (usually three) lines of text from the
+unmodified file; these are called the \emph{context} for the hunk.  If
+there's only a small amount of context between successive hunks,
+\command{diff} doesn't print a new hunk header; it just runs the hunks
+together, with a few lines of context between modifications.
+Each line of context begins with a space character.  Within the hunk,
+a line that begins with ``\texttt{-}'' means ``remove this line,''
+while a line that begins with ``\texttt{+}'' means ``insert this
+line.''  For example, a line that is modified is represented by one
+deletion and one insertion.
+We will return to some of the more subtle aspects of patches later (in
+section~\ref{sec:mq:adv-patch}), but you should have enough information
+now to use MQ.
+\section{Getting started with Mercurial Queues}
+\label{sec:mq:start}
+Because MQ is implemented as an extension, you must explicitly enable
+before you can use it.  (You don't need to download anything; MQ ships
+with the standard Mercurial distribution.)  To enable MQ, edit your
+\tildefile{.hgrc} file, and add the lines in figure~\ref{ex:mq:config}.
+\begin{figure}[ht]
+\begin{codesample4}
+[extensions]
+hgext.mq =
+\end{codesample4}
+\label{ex:mq:config}
+\caption{Contents to add to \tildefile{.hgrc} to enable the MQ extension}
+\end{figure}
+Once the extension is enabled, it will make a number of new commands
+available.  To verify that the extension is working, you can use
+\hgcmd{help} to see if the \hgxcmd{mq}{qinit} command is now available; see
+the example in figure~\ref{ex:mq:enabled}.
+\begin{figure}[ht]
+\interaction{mq.qinit-help.help}
+\caption{How to verify that MQ is enabled}
+\label{ex:mq:enabled}
+\end{figure}
+You can use MQ with \emph{any} Mercurial repository, and its commands
+only operate within that repository.  To get started, simply prepare
+the repository using the \hgxcmd{mq}{qinit} command (see
+figure~\ref{ex:mq:qinit}).  This command creates an empty directory
+called \sdirname{.hg/patches}, where MQ will keep its metadata.  As
+with many Mercurial commands, the \hgxcmd{mq}{qinit} command prints nothing
+if it succeeds.
+\begin{figure}[ht]
+\interaction{mq.tutorial.qinit}
+\caption{Preparing a repository for use with MQ}
+\label{ex:mq:qinit}
+\end{figure}
+\begin{figure}[ht]
+\interaction{mq.tutorial.qnew}
+\caption{Creating a new patch}
+\label{ex:mq:qnew}
+\end{figure}
+\subsection{Creating a new patch}
+To begin work on a new patch, use the \hgxcmd{mq}{qnew} command.  This
+command takes one argument, the name of the patch to create.  MQ will
+use this as the name of an actual file in the \sdirname{.hg/patches}
+directory, as you can see in figure~\ref{ex:mq:qnew}.
+Also newly present in the \sdirname{.hg/patches} directory are two
+other files, \sfilename{series} and \sfilename{status}.  The
+\sfilename{series} file lists all of the patches that MQ knows about
+for this repository, with one patch per line.  Mercurial uses the
+\sfilename{status} file for internal book-keeping; it tracks all of the
+patches that MQ has \emph{applied} in this repository.
+\begin{note}
+You may sometimes want to edit the \sfilename{series} file by hand;
+for example, to change the sequence in which some patches are
+applied.  However, manually editing the \sfilename{status} file is
+almost always a bad idea, as it's easy to corrupt MQ's idea of what
+is happening.
+\end{note}
+Once you have created your new patch, you can edit files in the
+working directory as you usually would.  All of the normal Mercurial
+commands, such as \hgcmd{diff} and \hgcmd{annotate}, work exactly as
+they did before.
+\subsection{Refreshing a patch}
+When you reach a point where you want to save your work, use the
+\hgxcmd{mq}{qrefresh} command (figure~\ref{ex:mq:qnew}) to update the patch
+you are working on.  This command folds the changes you have made in
+the working directory into your patch, and updates its corresponding
+changeset to contain those changes.
+\begin{figure}[ht]
+\interaction{mq.tutorial.qrefresh}
+\caption{Refreshing a patch}
+\label{ex:mq:qrefresh}
+\end{figure}
+You can run \hgxcmd{mq}{qrefresh} as often as you like, so it's a good way
+to ``checkpoint'' your work.  Refresh your patch at an opportune
+time; try an experiment; and if the experiment doesn't work out,
+\hgcmd{revert} your modifications back to the last time you refreshed.
+\begin{figure}[ht]
+\interaction{mq.tutorial.qrefresh2}
+\caption{Refresh a patch many times to accumulate changes}
+\label{ex:mq:qrefresh2}
+\end{figure}
+\subsection{Stacking and tracking patches}
+Once you have finished working on a patch, or need to work on another,
+you can use the \hgxcmd{mq}{qnew} command again to create a new patch.
+Mercurial will apply this patch on top of your existing patch.  See
+figure~\ref{ex:mq:qnew2} for an example.  Notice that the patch
+contains the changes in our prior patch as part of its context (you
+can see this more clearly in the output of \hgcmd{annotate}).
+\begin{figure}[ht]
+\interaction{mq.tutorial.qnew2}
+\caption{Stacking a second patch on top of the first}
+\label{ex:mq:qnew2}
+\end{figure}
+So far, with the exception of \hgxcmd{mq}{qnew} and \hgxcmd{mq}{qrefresh}, we've
+been careful to only use regular Mercurial commands.  However, MQ
+provides many commands that are easier to use when you are thinking
+about patches, as illustrated in figure~\ref{ex:mq:qseries}:
+\begin{itemize}
+\item The \hgxcmd{mq}{qseries} command lists every patch that MQ knows
+about in this repository, from oldest to newest (most recently
+\emph{created}).
+\item The \hgxcmd{mq}{qapplied} command lists every patch that MQ has
+\emph{applied} in this repository, again from oldest to newest (most
+recently applied).
+\end{itemize}
+\begin{figure}[ht]
+\interaction{mq.tutorial.qseries}
+\caption{Understanding the patch stack with \hgxcmd{mq}{qseries} and
+\hgxcmd{mq}{qapplied}}
+\label{ex:mq:qseries}
+\end{figure}
+\subsection{Manipulating the patch stack}
+The previous discussion implied that there must be a difference
+between ``known'' and ``applied'' patches, and there is.  MQ can
+manage a patch without it being applied in the repository.
+An \emph{applied} patch has a corresponding changeset in the
+repository, and the effects of the patch and changeset are visible in
+the working directory.  You can undo the application of a patch using
+the \hgxcmd{mq}{qpop} command.  MQ still \emph{knows about}, or manages, a
+popped patch, but the patch no longer has a corresponding changeset in
+the repository, and the working directory does not contain the changes
+made by the patch.  Figure~\ref{fig:mq:stack} illustrates the
+difference between applied and tracked patches.
+\begin{figure}[ht]
+\centering
+\grafix{mq-stack}
+\caption{Applied and unapplied patches in the MQ patch stack}
+\label{fig:mq:stack}
+\end{figure}
+You can reapply an unapplied, or popped, patch using the \hgxcmd{mq}{qpush}
+command.  This creates a new changeset to correspond to the patch, and
+the patch's changes once again become present in the working
+directory.  See figure~\ref{ex:mq:qpop} for examples of \hgxcmd{mq}{qpop}
+and \hgxcmd{mq}{qpush} in action.  Notice that once we have popped a patch
+or two patches, the output of \hgxcmd{mq}{qseries} remains the same, while
+that of \hgxcmd{mq}{qapplied} has changed.
+\begin{figure}[ht]
+\interaction{mq.tutorial.qpop}
+\caption{Modifying the stack of applied patches}
+\label{ex:mq:qpop}
+\end{figure}
+\subsection{Pushing and popping many patches}
+While \hgxcmd{mq}{qpush} and \hgxcmd{mq}{qpop} each operate on a single patch at
+a time by default, you can push and pop many patches in one go.  The
+\hgxopt{mq}{qpush}{-a} option to \hgxcmd{mq}{qpush} causes it to push all
+unapplied patches, while the \hgxopt{mq}{qpop}{-a} option to \hgxcmd{mq}{qpop}
+causes it to pop all applied patches.  (For some more ways to push and
+pop many patches, see section~\ref{sec:mq:perf} below.)
+\begin{figure}[ht]
+\interaction{mq.tutorial.qpush-a}
+\caption{Pushing all unapplied patches}
+\label{ex:mq:qpush-a}
+\end{figure}
+\subsection{Safety checks, and overriding them}
+Several MQ commands check the working directory before they do
+anything, and fail if they find any modifications.  They do this to
+ensure that you won't lose any changes that you have made, but not yet
+incorporated into a patch.  Figure~\ref{ex:mq:add} illustrates this;
+the \hgxcmd{mq}{qnew} command will not create a new patch if there are
+outstanding changes, caused in this case by the \hgcmd{add} of
+\filename{file3}.
+\begin{figure}[ht]
+\interaction{mq.tutorial.add}
+\caption{Forcibly creating a patch}
+\label{ex:mq:add}
+\end{figure}
+Commands that check the working directory all take an ``I know what
+I'm doing'' option, which is always named \option{-f}.  The exact
+meaning of \option{-f} depends on the command.  For example,
+\hgcmdargs{qnew}{\hgxopt{mq}{qnew}{-f}} will incorporate any outstanding
+changes into the new patch it creates, but
+\hgcmdargs{qpop}{\hgxopt{mq}{qpop}{-f}} will revert modifications to any
+files affected by the patch that it is popping.  Be sure to read the
+documentation for a command's \option{-f} option before you use it!
+\subsection{Working on several patches at once}
+The \hgxcmd{mq}{qrefresh} command always refreshes the \emph{topmost}
+applied patch.  This means that you can suspend work on one patch (by
+refreshing it), pop or push to make a different patch the top, and
+work on \emph{that} patch for a while.
+Here's an example that illustrates how you can use this ability.
+Let's say you're developing a new feature as two patches.  The first
+is a change to the core of your software, and the second---layered on
+top of the first---changes the user interface to use the code you just
+added to the core.  If you notice a bug in the core while you're
+working on the UI patch, it's easy to fix the core.  Simply
+\hgxcmd{mq}{qrefresh} the UI patch to save your in-progress changes, and
+\hgxcmd{mq}{qpop} down to the core patch.  Fix the core bug,
+\hgxcmd{mq}{qrefresh} the core patch, and \hgxcmd{mq}{qpush} back to the UI
+patch to continue where you left off.
+\section{More about patches}
+\label{sec:mq:adv-patch}
+MQ uses the GNU \command{patch} command to apply patches, so it's
+helpful to know a few more detailed aspects of how \command{patch}
+works, and about patches themselves.
+\subsection{The strip count}
+If you look at the file headers in a patch, you will notice that the
+pathnames usually have an extra component on the front that isn't
+present in the actual path name.  This is a holdover from the way that
+people used to generate patches (people still do this, but it's
+somewhat rare with modern revision control tools).
+Alice would unpack a tarball, edit her files, then decide that she
+wanted to create a patch.  So she'd rename her working directory,
+unpack the tarball again (hence the need for the rename), and use the
+\cmdopt{diff}{-r} and \cmdopt{diff}{-N} options to \command{diff} to
+recursively generate a patch between the unmodified directory and the
+modified one.  The result would be that the name of the unmodified
+directory would be at the front of the left-hand path in every file
+header, and the name of the modified directory would be at the front
+of the right-hand path.
+Since someone receiving a patch from the Alices of the net would be
+unlikely to have unmodified and modified directories with exactly the
+same names, the \command{patch} command has a \cmdopt{patch}{-p}
+option that indicates the number of leading path name components to
+strip when trying to apply a patch.  This number is called the
+\emph{strip count}.
+An option of ``\texttt{-p1}'' means ``use a strip count of one''.  If
+\command{patch} sees a file name \filename{foo/bar/baz} in a file
+header, it will strip \filename{foo} and try to patch a file named
+\filename{bar/baz}.  (Strictly speaking, the strip count refers to the
+number of \emph{path separators} (and the components that go with them
+) to strip.  A strip count of one will turn \filename{foo/bar} into
+\filename{bar}, but \filename{/foo/bar} (notice the extra leading
+slash) into \filename{foo/bar}.)
+The ``standard'' strip count for patches is one; almost all patches
+contain one leading path name component that needs to be stripped.
+Mercurial's \hgcmd{diff} command generates path names in this form,
+and the \hgcmd{import} command and MQ expect patches to have a strip
+count of one.
+If you receive a patch from someone that you want to add to your patch
+queue, and the patch needs a strip count other than one, you cannot
+just \hgxcmd{mq}{qimport} the patch, because \hgxcmd{mq}{qimport} does not yet
+have a \texttt{-p} option (see~\bug{311}).  Your best bet is to
+\hgxcmd{mq}{qnew} a patch of your own, then use \cmdargs{patch}{-p\emph{N}}
+to apply their patch, followed by \hgcmd{addremove} to pick up any
+files added or removed by the patch, followed by \hgxcmd{mq}{qrefresh}.
+This complexity may become unnecessary; see~\bug{311} for details.
+\subsection{Strategies for applying a patch}
+When \command{patch} applies a hunk, it tries a handful of
+successively less accurate strategies to try to make the hunk apply.
+This falling-back technique often makes it possible to take a patch
+that was generated against an old version of a file, and apply it
+against a newer version of that file.
+First, \command{patch} tries an exact match, where the line numbers,
+the context, and the text to be modified must apply exactly.  If it
+cannot make an exact match, it tries to find an exact match for the
+context, without honouring the line numbering information.  If this
+succeeds, it prints a line of output saying that the hunk was applied,
+but at some \emph{offset} from the original line number.
+If a context-only match fails, \command{patch} removes the first and
+last lines of the context, and tries a \emph{reduced} context-only
+match.  If the hunk with reduced context succeeds, it prints a message
+saying that it applied the hunk with a \emph{fuzz factor} (the number
+after the fuzz factor indicates how many lines of context
+\command{patch} had to trim before the patch applied).
+When neither of these techniques works, \command{patch} prints a
+message saying that the hunk in question was rejected.  It saves
+rejected hunks (also simply called ``rejects'') to a file with the
+same name, and an added \sfilename{.rej} extension.  It also saves an
+unmodified copy of the file with a \sfilename{.orig} extension; the
+copy of the file without any extensions will contain any changes made
+by hunks that \emph{did} apply cleanly.  If you have a patch that
+modifies \filename{foo} with six hunks, and one of them fails to
+apply, you will have: an unmodified \filename{foo.orig}, a
+\filename{foo.rej} containing one hunk, and \filename{foo}, containing
+the changes made by the five successful hunks.
+\subsection{Some quirks of patch representation}
+There are a few useful things to know about how \command{patch} works
+with files.
+\begin{itemize}
+\item This should already be obvious, but \command{patch} cannot
+handle binary files.
+\item Neither does it care about the executable bit; it creates new
+files as readable, but not executable.
+\item \command{patch} treats the removal of a file as a diff between
+the file to be removed and the empty file.  So your idea of ``I
+deleted this file'' looks like ``every line of this file was
+deleted'' in a patch.
+\item It treats the addition of a file as a diff between the empty
+file and the file to be added.  So in a patch, your idea of ``I
+added this file'' looks like ``every line of this file was added''.
+\item It treats a renamed file as the removal of the old name, and the
+addition of the new name.  This means that renamed files have a big
+footprint in patches.  (Note also that Mercurial does not currently
+try to infer when files have been renamed or copied in a patch.)
+\item \command{patch} cannot represent empty files, so you cannot use
+a patch to represent the notion ``I added this empty file to the
+tree''.
+\end{itemize}
+\subsection{Beware the fuzz}
+While applying a hunk at an offset, or with a fuzz factor, will often
+be completely successful, these inexact techniques naturally leave
+open the possibility of corrupting the patched file.  The most common
+cases typically involve applying a patch twice, or at an incorrect
+location in the file.  If \command{patch} or \hgxcmd{mq}{qpush} ever
+mentions an offset or fuzz factor, you should make sure that the
+modified files are correct afterwards.
+It's often a good idea to refresh a patch that has applied with an
+offset or fuzz factor; refreshing the patch generates new context
+information that will make it apply cleanly.  I say ``often,'' not
+``always,'' because sometimes refreshing a patch will make it fail to
+apply against a different revision of the underlying files.  In some
+cases, such as when you're maintaining a patch that must sit on top of
+multiple versions of a source tree, it's acceptable to have a patch
+apply with some fuzz, provided you've verified the results of the
+patching process in such cases.
+\subsection{Handling rejection}
+If \hgxcmd{mq}{qpush} fails to apply a patch, it will print an error
+message and exit.  If it has left \sfilename{.rej} files behind, it is
+usually best to fix up the rejected hunks before you push more patches
+or do any further work.
+If your patch \emph{used to} apply cleanly, and no longer does because
+you've changed the underlying code that your patches are based on,
+Mercurial Queues can help; see section~\ref{sec:mq:merge} for details.
+Unfortunately, there aren't any great techniques for dealing with
+rejected hunks.  Most often, you'll need to view the \sfilename{.rej}
+file and edit the target file, applying the rejected hunks by hand.
+If you're feeling adventurous, Neil Brown, a Linux kernel hacker,
+wrote a tool called \command{wiggle}~\cite{web:wiggle}, which is more
+vigorous than \command{patch} in its attempts to make a patch apply.
+Another Linux kernel hacker, Chris Mason (the author of Mercurial
+Queues), wrote a similar tool called
+\command{mpatch}~\cite{web:mpatch}, which takes a simple approach to
+automating the application of hunks rejected by \command{patch}.  The
+\command{mpatch} command can help with four common reasons that a hunk
+may be rejected:
+\begin{itemize}
+\item The context in the middle of a hunk has changed.
+\item A hunk is missing some context at the beginning or end.
+\item A large hunk might apply better---either entirely or in
+part---if it was broken up into smaller hunks.
+\item A hunk removes lines with slightly different content than those
+currently present in the file.
+\end{itemize}
+If you use \command{wiggle} or \command{mpatch}, you should be doubly
+careful to check your results when you're done.  In fact,
+\command{mpatch} enforces this method of double-checking the tool's
+output, by automatically dropping you into a merge program when it has
+done its job, so that you can verify its work and finish off any
+remaining merges.
+\section{Getting the best performance out of MQ}
+\label{sec:mq:perf}
+MQ is very efficient at handling a large number of patches.  I ran
+some performance experiments in mid-2006 for a talk that I gave at the
+2006 EuroPython conference~\cite{web:europython}.  I used as my data
+set the Linux 2.6.17-mm1 patch series, which consists of 1,738
+patches.  I applied these on top of a Linux kernel repository
+containing all 27,472 revisions between Linux 2.6.12-rc2 and Linux
+2.6.17.
+On my old, slow laptop, I was able to
+\hgcmdargs{qpush}{\hgxopt{mq}{qpush}{-a}} all 1,738 patches in 3.5 minutes,
+and \hgcmdargs{qpop}{\hgxopt{mq}{qpop}{-a}} them all in 30 seconds.  (On a
+newer laptop, the time to push all patches dropped to two minutes.)  I
+could \hgxcmd{mq}{qrefresh} one of the biggest patches (which made 22,779
+lines of changes to 287 files) in 6.6 seconds.
+Clearly, MQ is well suited to working in large trees, but there are a
+few tricks you can use to get the best performance of it.
+First of all, try to ``batch'' operations together.  Every time you
+run \hgxcmd{mq}{qpush} or \hgxcmd{mq}{qpop}, these commands scan the working
+directory once to make sure you haven't made some changes and then
+forgotten to run \hgxcmd{mq}{qrefresh}.  On a small tree, the time that
+this scan takes is unnoticeable.  However, on a medium-sized tree
+(containing tens of thousands of files), it can take a second or more.
+The \hgxcmd{mq}{qpush} and \hgxcmd{mq}{qpop} commands allow you to push and pop
+multiple patches at a time.  You can identify the ``destination
+patch'' that you want to end up at.  When you \hgxcmd{mq}{qpush} with a
+destination specified, it will push patches until that patch is at the
+top of the applied stack.  When you \hgxcmd{mq}{qpop} to a destination, MQ
+will pop patches until the destination patch is at the top.
+You can identify a destination patch using either the name of the
+patch, or by number.  If you use numeric addressing, patches are
+counted from zero; this means that the first patch is zero, the second
+is one, and so on.
+\section{Updating your patches when the underlying code changes}
+\label{sec:mq:merge}
+It's common to have a stack of patches on top of an underlying
+repository that you don't modify directly.  If you're working on
+changes to third-party code, or on a feature that is taking longer to
+develop than the rate of change of the code beneath, you will often
+need to sync up with the underlying code, and fix up any hunks in your
+patches that no longer apply.  This is called \emph{rebasing} your
+patch series.
+The simplest way to do this is to \hgcmdargs{qpop}{\hgxopt{mq}{qpop}{-a}}
+your patches, then \hgcmd{pull} changes into the underlying
+repository, and finally \hgcmdargs{qpush}{\hgxopt{mq}{qpop}{-a}} your
+patches again.  MQ will stop pushing any time it runs across a patch
+that fails to apply during conflicts, allowing you to fix your
+conflicts, \hgxcmd{mq}{qrefresh} the affected patch, and continue pushing
+until you have fixed your entire stack.
+This approach is easy to use and works well if you don't expect
+changes to the underlying code to affect how well your patches apply.
+If your patch stack touches code that is modified frequently or
+invasively in the underlying repository, however, fixing up rejected
+hunks by hand quickly becomes tiresome.
+It's possible to partially automate the rebasing process.  If your
+patches apply cleanly against some revision of the underlying repo, MQ
+can use this information to help you to resolve conflicts between your
+patches and a different revision.
+The process is a little involved.
+\begin{enumerate}
+\item To begin, \hgcmdargs{qpush}{-a} all of your patches on top of
+the revision where you know that they apply cleanly.
+\item Save a backup copy of your patch directory using
+\hgcmdargs{qsave}{\hgxopt{mq}{qsave}{-e} \hgxopt{mq}{qsave}{-c}}.  This prints
+the name of the directory that it has saved the patches in.  It will
+save the patches to a directory called
+\sdirname{.hg/patches.\emph{N}}, where \texttt{\emph{N}} is a small
+integer.  It also commits a ``save changeset'' on top of your
+applied patches; this is for internal book-keeping, and records the
+states of the \sfilename{series} and \sfilename{status} files.
+\item Use \hgcmd{pull} to bring new changes into the underlying
+repository.  (Don't run \hgcmdargs{pull}{-u}; see below for why.)
+\item Update to the new tip revision, using
+\hgcmdargs{update}{\hgopt{update}{-C}} to override the patches you
+have pushed.
+\item Merge all patches using \hgcmdargs{qpush}{\hgxopt{mq}{qpush}{-m}
+\hgxopt{mq}{qpush}{-a}}.  The \hgxopt{mq}{qpush}{-m} option to \hgxcmd{mq}{qpush}
+tells MQ to perform a three-way merge if the patch fails to apply.
+\end{enumerate}
+During the \hgcmdargs{qpush}{\hgxopt{mq}{qpush}{-m}}, each patch in the
+\sfilename{series} file is applied normally.  If a patch applies with
+fuzz or rejects, MQ looks at the queue you \hgxcmd{mq}{qsave}d, and
+performs a three-way merge with the corresponding changeset.  This
+merge uses Mercurial's normal merge machinery, so it may pop up a GUI
+merge tool to help you to resolve problems.
+When you finish resolving the effects of a patch, MQ refreshes your
+patch based on the result of the merge.
+At the end of this process, your repository will have one extra head
+from the old patch queue, and a copy of the old patch queue will be in
+\sdirname{.hg/patches.\emph{N}}. You can remove the extra head using
+\hgcmdargs{qpop}{\hgxopt{mq}{qpop}{-a} \hgxopt{mq}{qpop}{-n} patches.\emph{N}}
+or \hgcmd{strip}.  You can delete \sdirname{.hg/patches.\emph{N}} once
+you are sure that you no longer need it as a backup.
+\section{Identifying patches}
+MQ commands that work with patches let you refer to a patch either by
+using its name or by a number.  By name is obvious enough; pass the
+name \filename{foo.patch} to \hgxcmd{mq}{qpush}, for example, and it will
+push patches until \filename{foo.patch} is applied.
+As a shortcut, you can refer to a patch using both a name and a
+numeric offset; \texttt{foo.patch-2} means ``two patches before
+\texttt{foo.patch}'', while \texttt{bar.patch+4} means ``four patches
+after \texttt{bar.patch}''.
+Referring to a patch by index isn't much different.  The first patch
+printed in the output of \hgxcmd{mq}{qseries} is patch zero (yes, it's one
+of those start-at-zero counting systems); the second is patch one; and
+so on.
+MQ also makes it easy to work with patches when you are using normal
+Mercurial commands.  Every command that accepts a changeset ID will
+also accept the name of an applied patch.  MQ augments the tags
+normally in the repository with an eponymous one for each applied
+patch.  In addition, the special tags \index{tags!special tag
+names!\texttt{qbase}}\texttt{qbase} and \index{tags!special tag
+names!\texttt{qtip}}\texttt{qtip} identify the ``bottom-most'' and
+topmost applied patches, respectively.
+These additions to Mercurial's normal tagging capabilities make
+dealing with patches even more of a breeze.
+\begin{itemize}
+\item Want to patchbomb a mailing list with your latest series of
+changes?
+\begin{codesample4}
+hg email qbase:qtip
+\end{codesample4}
+(Don't know what ``patchbombing'' is?  See
+section~\ref{sec:hgext:patchbomb}.)
+\item Need to see all of the patches since \texttt{foo.patch} that
+have touched files in a subdirectory of your tree?
+\begin{codesample4}
+hg log -r foo.patch:qtip \emph{subdir}
+\end{codesample4}
+\end{itemize}
+Because MQ makes the names of patches available to the rest of
+Mercurial through its normal internal tag machinery, you don't need to
+type in the entire name of a patch when you want to identify it by
+name.
+\begin{figure}[ht]
+\interaction{mq.id.output}
+\caption{Using MQ's tag features to work with patches}
+\label{ex:mq:id}
+\end{figure}
+Another nice consequence of representing patch names as tags is that
+when you run the \hgcmd{log} command, it will display a patch's name
+as a tag, simply as part of its normal output.  This makes it easy to
+visually distinguish applied patches from underlying ``normal''
+revisions.  Figure~\ref{ex:mq:id} shows a few normal Mercurial
+commands in use with applied patches.
+\section{Useful things to know about}
+There are a number of aspects of MQ usage that don't fit tidily into
+sections of their own, but that are good to know.  Here they are, in
+one place.
+\begin{itemize}
+\item Normally, when you \hgxcmd{mq}{qpop} a patch and \hgxcmd{mq}{qpush} it
+again, the changeset that represents the patch after the pop/push
+will have a \emph{different identity} than the changeset that
+represented the hash beforehand.  See
+section~\ref{sec:mqref:cmd:qpush} for information as to why this is.
+\item It's not a good idea to \hgcmd{merge} changes from another
+branch with a patch changeset, at least if you want to maintain the
+``patchiness'' of that changeset and changesets below it on the
+patch stack.  If you try to do this, it will appear to succeed, but
+MQ will become confused.
+\end{itemize}
+\section{Managing patches in a repository}
+\label{sec:mq:repo}
+Because MQ's \sdirname{.hg/patches} directory resides outside a
+Mercurial repository's working directory, the ``underlying'' Mercurial
+repository knows nothing about the management or presence of patches.
+This presents the interesting possibility of managing the contents of
+the patch directory as a Mercurial repository in its own right.  This
+can be a useful way to work.  For example, you can work on a patch for
+a while, \hgxcmd{mq}{qrefresh} it, then \hgcmd{commit} the current state of
+the patch.  This lets you ``roll back'' to that version of the patch
+later on.
+You can then share different versions of the same patch stack among
+multiple underlying repositories.  I use this when I am developing a
+Linux kernel feature.  I have a pristine copy of my kernel sources for
+each of several CPU architectures, and a cloned repository under each
+that contains the patches I am working on.  When I want to test a
+change on a different architecture, I push my current patches to the
+patch repository associated with that kernel tree, pop and push all of
+my patches, and build and test that kernel.
+Managing patches in a repository makes it possible for multiple
+developers to work on the same patch series without colliding with
+each other, all on top of an underlying source base that they may or
+may not control.
+\subsection{MQ support for patch repositories}
+MQ helps you to work with the \sdirname{.hg/patches} directory as a
+repository; when you prepare a repository for working with patches
+using \hgxcmd{mq}{qinit}, you can pass the \hgxopt{mq}{qinit}{-c} option to
+create the \sdirname{.hg/patches} directory as a Mercurial repository.
+\begin{note}
+If you forget to use the \hgxopt{mq}{qinit}{-c} option, you can simply go
+into the \sdirname{.hg/patches} directory at any time and run
+\hgcmd{init}.  Don't forget to add an entry for the
+\sfilename{status} file to the \sfilename{.hgignore} file, though
+(\hgcmdargs{qinit}{\hgxopt{mq}{qinit}{-c}} does this for you
+automatically); you \emph{really} don't want to manage the
+\sfilename{status} file.
+\end{note}
+As a convenience, if MQ notices that the \dirname{.hg/patches}
+directory is a repository, it will automatically \hgcmd{add} every
+patch that you create and import.
+MQ provides a shortcut command, \hgxcmd{mq}{qcommit}, that runs
+\hgcmd{commit} in the \sdirname{.hg/patches} directory.  This saves
+some bothersome typing.
+Finally, as a convenience to manage the patch directory, you can
+define the alias \command{mq} on Unix systems. For example, on Linux
+systems using the \command{bash} shell, you can include the following
+snippet in your \tildefile{.bashrc}.
+\begin{codesample2}
+alias mq=`hg -R \$(hg root)/.hg/patches'
+\end{codesample2}
+You can then issue commands of the form \cmdargs{mq}{pull} from
+the main repository.
+\subsection{A few things to watch out for}
+MQ's support for working with a repository full of patches is limited
+in a few small respects.
+MQ cannot automatically detect changes that you make to the patch
+directory.  If you \hgcmd{pull}, manually edit, or \hgcmd{update}
+changes to patches or the \sfilename{series} file, you will have to
+\hgcmdargs{qpop}{\hgxopt{mq}{qpop}{-a}} and then
+\hgcmdargs{qpush}{\hgxopt{mq}{qpush}{-a}} in the underlying repository to
+see those changes show up there.  If you forget to do this, you can
+confuse MQ's idea of which patches are applied.
+\section{Third party tools for working with patches}
+\label{sec:mq:tools}
+Once you've been working with patches for a while, you'll find
+yourself hungry for tools that will help you to understand and
+manipulate the patches you're dealing with.
+The \command{diffstat} command~\cite{web:diffstat} generates a
+histogram of the modifications made to each file in a patch.  It
+provides a good way to ``get a sense of'' a patch---which files it
+affects, and how much change it introduces to each file and as a
+whole.  (I find that it's a good idea to use \command{diffstat}'s
+\cmdopt{diffstat}{-p} option as a matter of course, as otherwise it
+will try to do clever things with prefixes of file names that
+inevitably confuse at least me.)
+\begin{figure}[ht]
+\interaction{mq.tools.tools}
+\caption{The \command{diffstat}, \command{filterdiff}, and \command{lsdiff} commands}
+\label{ex:mq:tools}
+\end{figure}
+The \package{patchutils} package~\cite{web:patchutils} is invaluable.
+It provides a set of small utilities that follow the ``Unix
+philosophy;'' each does one useful thing with a patch.  The
+\package{patchutils} command I use most is \command{filterdiff}, which
+extracts subsets from a patch file.  For example, given a patch that
+modifies hundreds of files across dozens of directories, a single
+invocation of \command{filterdiff} can generate a smaller patch that
+only touches files whose names match a particular glob pattern.  See
+section~\ref{mq-collab:tips:interdiff} for another example.
+\section{Good ways to work with patches}
+Whether you are working on a patch series to submit to a free software
+or open source project, or a series that you intend to treat as a
+sequence of regular changesets when you're done, you can use some
+simple techniques to keep your work well organised.
+Give your patches descriptive names.  A good name for a patch might be
+\filename{rework-device-alloc.patch}, because it will immediately give
+you a hint what the purpose of the patch is.  Long names shouldn't be
+a problem; you won't be typing the names often, but you \emph{will} be
+running commands like \hgxcmd{mq}{qapplied} and \hgxcmd{mq}{qtop} over and over.
+Good naming becomes especially important when you have a number of
+patches to work with, or if you are juggling a number of different
+tasks and your patches only get a fraction of your attention.
+Be aware of what patch you're working on.  Use the \hgxcmd{mq}{qtop}
+command and skim over the text of your patches frequently---for
+example, using \hgcmdargs{tip}{\hgopt{tip}{-p}})---to be sure of where
+you stand.  I have several times worked on and \hgxcmd{mq}{qrefresh}ed a
+patch other than the one I intended, and it's often tricky to migrate
+changes into the right patch after making them in the wrong one.
+For this reason, it is very much worth investing a little time to
+learn how to use some of the third-party tools I described in
+section~\ref{sec:mq:tools}, particularly \command{diffstat} and
+\command{filterdiff}.  The former will give you a quick idea of what
+changes your patch is making, while the latter makes it easy to splice
+hunks selectively out of one patch and into another.
+\section{MQ cookbook}
+\subsection{Manage ``trivial'' patches}
+Because the overhead of dropping files into a new Mercurial repository
+is so low, it makes a lot of sense to manage patches this way even if
+you simply want to make a few changes to a source tarball that you
+downloaded.
+Begin by downloading and unpacking the source tarball,
+and turning it into a Mercurial repository.
+\interaction{mq.tarball.download}
+Continue by creating a patch stack and making your changes.
+\interaction{mq.tarball.qinit}
+Let's say a few weeks or months pass, and your package author releases
+a new version.  First, bring their changes into the repository.
+\interaction{mq.tarball.newsource}
+The pipeline starting with \hgcmd{locate} above deletes all files in
+the working directory, so that \hgcmd{commit}'s
+\hgopt{commit}{--addremove} option can actually tell which files have
+really been removed in the newer version of the source.
+Finally, you can apply your patches on top of the new tree.
+\interaction{mq.tarball.repush}
+\subsection{Combining entire patches}
+\label{sec:mq:combine}
+MQ provides a command, \hgxcmd{mq}{qfold} that lets you combine entire
+patches.  This ``folds'' the patches you name, in the order you name
+them, into the topmost applied patch, and concatenates their
+descriptions onto the end of its description.  The patches that you
+fold must be unapplied before you fold them.
+The order in which you fold patches matters.  If your topmost applied
+patch is \texttt{foo}, and you \hgxcmd{mq}{qfold} \texttt{bar} and
+\texttt{quux} into it, you will end up with a patch that has the same
+effect as if you applied first \texttt{foo}, then \texttt{bar},
+followed by \texttt{quux}.
+\subsection{Merging part of one patch into another}
+Merging \emph{part} of one patch into another is more difficult than
+combining entire patches.
+If you want to move changes to entire files, you can use
+\command{filterdiff}'s \cmdopt{filterdiff}{-i} and
+\cmdopt{filterdiff}{-x} options to choose the modifications to snip
+out of one patch, concatenating its output onto the end of the patch
+you want to merge into.  You usually won't need to modify the patch
+you've merged the changes from.  Instead, MQ will report some rejected
+hunks when you \hgxcmd{mq}{qpush} it (from the hunks you moved into the
+other patch), and you can simply \hgxcmd{mq}{qrefresh} the patch to drop
+the duplicate hunks.
+If you have a patch that has multiple hunks modifying a file, and you
+only want to move a few of those hunks, the job becomes more messy,
+but you can still partly automate it.  Use \cmdargs{lsdiff}{-nvv} to
+print some metadata about the patch.
+\interaction{mq.tools.lsdiff}
+This command prints three different kinds of number:
+\begin{itemize}
+\item (in the first column) a \emph{file number} to identify each file
+modified in the patch;
+\item (on the next line, indented) the line number within a modified
+file where a hunk starts; and
+\item (on the same line) a \emph{hunk number} to identify that hunk.
+\end{itemize}
+You'll have to use some visual inspection, and reading of the patch,
+to identify the file and hunk numbers you'll want, but you can then
+pass them to to \command{filterdiff}'s \cmdopt{filterdiff}{--files}
+and \cmdopt{filterdiff}{--hunks} options, to select exactly the file
+and hunk you want to extract.
+Once you have this hunk, you can concatenate it onto the end of your
+destination patch and continue with the remainder of
+section~\ref{sec:mq:combine}.
+\section{Differences between quilt and MQ}
+If you are already familiar with quilt, MQ provides a similar command
+set.  There are a few differences in the way that it works.
+You will already have noticed that most quilt commands have MQ
+counterparts that simply begin with a ``\texttt{q}''.  The exceptions
+are quilt's \texttt{add} and \texttt{remove} commands, the
+counterparts for which are the normal Mercurial \hgcmd{add} and
+\hgcmd{remove} commands.  There is no MQ equivalent of the quilt
+\texttt{edit} command.
+%%% Local Variables:
+%%% mode: latex
+%%% TeX-master: "00book"
+%%% End:

Mercurial > hgbook

comparison en/ch12-mq.tex @ 649:5cd47f721686