comparison en/ch12-mq.tex @ 649:5cd47f721686

Rename LaTeX input files to have numeric prefixes
author Bryan O'Sullivan <bos@serpentine.com>
date Thu, 29 Jan 2009 22:56:27 -0800
parents en/mq.tex@97e929385442
children
648:bc14f94e726a 649:5cd47f721686
1 \chapter{Managing change with Mercurial Queues}
2 \label{chap:mq}
3
4 \section{The patch management problem}
5 \label{sec:mq:patch-mgmt}
6
7 Here is a common scenario: you need to install a software package from
8 source, but you find a bug that you must fix in the source before you
9 can start using the package. You make your changes, forget about the
10 package for a while, and a few months later you need to upgrade to a
11 newer version of the package. If the newer version of the package
12 still has the bug, you must extract your fix from the older source
13 tree and apply it against the newer version. This is a tedious task,
14 and it's easy to make mistakes.
15
16 This is a simple case of the ``patch management'' problem. You have
17 an ``upstream'' source tree that you can't change; you need to make
18 some local changes on top of the upstream tree; and you'd like to be
19 able to keep those changes separate, so that you can apply them to
20 newer versions of the upstream source.
21
22 The patch management problem arises in many situations. Probably the
23 most visible is that a user of an open source software project will
24 contribute a bug fix or new feature to the project's maintainers in the
25 form of a patch.
26
27 Distributors of operating systems that include open source software
28 often need to make changes to the packages they distribute so that
29 they will build properly in their environments.
30
31 When you have few changes to maintain, it is easy to manage a single
32 patch using the standard \command{diff} and \command{patch} programs
33 (see section~\ref{sec:mq:patch} for a discussion of these tools).
34 Once the number of changes grows, it starts to make sense to maintain
35 patches as discrete ``chunks of work,'' so that for example a single
36 patch will contain only one bug fix (the patch might modify several
37 files, but it's doing ``only one thing''), and you may have a number
38 of such patches for different bugs you need fixed and local changes
39 you require. In this situation, if you submit a bug fix patch to the
40 upstream maintainers of a package and they include your fix in a
41 subsequent release, you can simply drop that single patch when you're
42 updating to the newer release.
43
44 Maintaining a single patch against an upstream tree is a little
45 tedious and error-prone, but not difficult. However, the complexity
46 of the problem grows rapidly as the number of patches you have to
47 maintain increases. With more than a tiny number of patches in hand,
48 understanding which ones you have applied and maintaining them moves
49 from messy to overwhelming.
50
51 Fortunately, Mercurial includes a powerful extension, Mercurial Queues
52 (or simply ``MQ''), that massively simplifies the patch management
53 problem.
54
55 \section{The prehistory of Mercurial Queues}
56 \label{sec:mq:history}
57
58 During the late 1990s, several Linux kernel developers started to
59 maintain ``patch series'' that modified the behaviour of the Linux
60 kernel. Some of these series were focused on stability, some on
61 feature coverage, and others were more speculative.
62
63 The sizes of these patch series grew rapidly. In 2002, Andrew Morton
64 published some shell scripts he had been using to automate the task of
65 managing his patch queues. Andrew was successfully using these
66 scripts to manage hundreds (sometimes thousands) of patches on top of
67 the Linux kernel.
68
69 \subsection{A patchwork quilt}
70 \label{sec:mq:quilt}
71
72 In early 2003, Andreas Gruenbacher and Martin Quinson borrowed the
73 approach of Andrew's scripts and published a tool called ``patchwork
74 quilt''~\cite{web:quilt}, or simply ``quilt''
75 (see~\cite{gruenbacher:2005} for a paper describing it). Because
76 quilt substantially automated patch management, it rapidly gained a
77 large following among open source software developers.
78
79 Quilt manages a \emph{stack of patches} on top of a directory tree.
80 To begin, you tell quilt to manage a directory tree, and tell it which
81 files you want to manage; it stores away the names and contents of
82 those files. To fix a bug, you create a new patch (using a single
83 command), edit the files you need to fix, then ``refresh'' the patch.
84
85 The refresh step causes quilt to scan the directory tree; it updates
86 the patch with all of the changes you have made. You can create
87 another patch on top of the first, which will track the changes
88 required to modify the tree from ``tree with one patch applied'' to
89 ``tree with two patches applied''.
90
91 You can \emph{change} which patches are applied to the tree. If you
92 ``pop'' a patch, the changes made by that patch will vanish from the
93 directory tree. Quilt remembers which patches you have popped,
94 though, so you can ``push'' a popped patch again, and the directory
95 tree will be restored to contain the modifications in the patch. Most
96 importantly, you can run the ``refresh'' command at any time, and the
97 topmost applied patch will be updated. This means that you can, at
98 any time, change both which patches are applied and what
99 modifications those patches make.
100
101 Quilt knows nothing about revision control tools, so it works equally
102 well on top of an unpacked tarball or a Subversion working copy.
103
104 \subsection{From patchwork quilt to Mercurial Queues}
105 \label{sec:mq:quilt-mq}
106
107 In mid-2005, Chris Mason took the features of quilt and wrote an
108 extension that he called Mercurial Queues, which added quilt-like
109 behaviour to Mercurial.
110
111 The key difference between quilt and MQ is that quilt knows nothing
112 about revision control systems, while MQ is \emph{integrated} into
113 Mercurial. Each patch that you push is represented as a Mercurial
114 changeset. Pop a patch, and the changeset goes away.
115
116 Because quilt does not care about revision control tools, it is still
117 a tremendously useful piece of software to know about for situations
118 where you cannot use Mercurial and MQ.
119
120 \section{The huge advantage of MQ}
121
122 I cannot overstate the value that MQ offers through the unification of
123 patches and revision control.
124
125 A major reason that patches have persisted in the free software and
126 open source world---in spite of the availability of increasingly
127 capable revision control tools over the years---is the \emph{agility}
128 they offer.
129
130 Traditional revision control tools make a permanent, irreversible
131 record of everything that you do. While this has great value, it's
132 also somewhat stifling. If you want to perform a wild-eyed
133 experiment, you have to be careful in how you go about it, or you risk
134 leaving unneeded---or worse, misleading or destabilising---traces of
135 your missteps and errors in the permanent revision record.
136
137 By contrast, MQ's marriage of distributed revision control with
138 patches makes it much easier to isolate your work. Your patches live
139 on top of normal revision history, and you can make them disappear or
140 reappear at will. If you don't like a patch, you can drop it. If a
141 patch isn't quite as you want it to be, simply fix it---as many times
142 as you need to, until you have refined it into the form you desire.
143
144 As an example, the integration of patches with revision control makes
145 understanding patches and debugging their effects---and their
146 interplay with the code they're based on---\emph{enormously} easier.
147 Since every applied patch has an associated changeset, you can use
148 \hgcmdargs{log}{\emph{filename}} to see which changesets and patches
149 affected a file. You can use the \hgcmd{bisect} command to
150 binary-search through all changesets and applied patches to see where
151 a bug got introduced or fixed. You can use the \hgcmd{annotate}
152 command to see which changeset or patch modified a particular line of
153 a source file. And so on.
154
155 \section{Understanding patches}
156 \label{sec:mq:patch}
157
158 Because MQ doesn't hide its patch-oriented nature, it is helpful to
159 understand what patches are, and a little about the tools that work
160 with them.
161
162 The traditional Unix \command{diff} command compares two files, and
163 prints a list of differences between them. The \command{patch} command
164 understands these differences as \emph{modifications} to make to a
165 file. Take a look at figure~\ref{ex:mq:diff} for a simple example of
166 these commands in action.
167
168 \begin{figure}[ht]
169 \interaction{mq.dodiff.diff}
170 \caption{Simple uses of the \command{diff} and \command{patch} commands}
171 \label{ex:mq:diff}
172 \end{figure}
173
174 The type of file that \command{diff} generates (and \command{patch}
175 takes as input) is called a ``patch'' or a ``diff''; there is no
176 difference between a patch and a diff. (We'll use the term ``patch'',
177 since it's more commonly used.)
178
179 A patch file can start with arbitrary text; the \command{patch}
180 command ignores this text, but MQ uses it as the commit message when
181 creating changesets. To find the beginning of the patch content,
182 \command{patch} searches for the first line that starts with the
183 string ``\texttt{diff~-}''.
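
As a rough illustration of the rule just described, here is a minimal
Python sketch that splits a patch into its leading text and its diff
content. The function name \texttt{split\_patch} and the sample patch
are invented for this example; this is not how \command{patch} or MQ
is actually implemented.

\begin{verbatim}
def split_patch(text):
    """Split patch text into (leading text, diff content)."""
    lines = text.splitlines(keepends=True)
    for index, line in enumerate(lines):
        # The diff content starts at the first line beginning with "diff -".
        if line.startswith("diff -"):
            return "".join(lines[:index]), "".join(lines[index:])
    # No such line: everything is leading text, and there is no diff.
    return text, ""


sample = (
    "Fix a typo in the greeting\n"
    "\n"
    "diff -r 000000000000 hello.txt\n"
    "--- a/hello.txt\n"
    "+++ b/hello.txt\n"
    "@@ -1,1 +1,1 @@\n"
    "-helo\n"
    "+hello\n"
)
message, diff = split_patch(sample)
print(message.strip())
\end{verbatim}

The first element of the result is the text that MQ would use as the
commit message; the second is what \command{patch} would act upon.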
184
185 MQ works with \emph{unified} diffs (\command{patch} can accept several
186 other diff formats, but MQ doesn't). A unified diff contains two
187 kinds of header. The \emph{file header} describes the file being
188 modified; it contains the name of the file to modify. When
189 \command{patch} sees a new file header, it looks for a file with that
190 name to start modifying.
191
192 After the file header comes a series of \emph{hunks}. Each hunk
193 starts with a header; this identifies the range of line numbers within
194 the file that the hunk should modify. Following the header, a hunk
195 starts and ends with a few (usually three) lines of text from the
196 unmodified file; these are called the \emph{context} for the hunk. If
197 there's only a small amount of context between successive hunks,
198 \command{diff} doesn't print a new hunk header; it just runs the hunks
199 together, with a few lines of context between modifications.
200
201 Each line of context begins with a space character. Within the hunk,
202 a line that begins with ``\texttt{-}'' means ``remove this line,''
203 while a line that begins with ``\texttt{+}'' means ``insert this
204 line.'' For example, a line that is modified is represented by one
205 deletion and one insertion.
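
If you would like to see these pieces without reaching for
\command{diff}, Python's standard \texttt{difflib} module can produce
a small unified diff. This is only a sketch: the file name
\filename{numbers.txt}, the \texttt{a/} and \texttt{b/} prefixes and
the helper name are assumptions made for the example.

\begin{verbatim}
import difflib


def make_unified_diff(old_text, new_text, name):
    """Return the unified diff from old_text to new_text as a list of lines."""
    return list(difflib.unified_diff(
        old_text.splitlines(keepends=True),
        new_text.splitlines(keepends=True),
        fromfile="a/" + name,
        tofile="b/" + name,
    ))


old = "one\ntwo\nthree\nfour\nfive\n"
new = "one\ntwo\n3\nfour\nfive\n"
patch_lines = make_unified_diff(old, new, "numbers.txt")
print("".join(patch_lines), end="")
\end{verbatim}

The output begins with the two-line file header, followed by a single
hunk header that starts with \texttt{@@}. The hunk holds two lines of
context on each side of the change (the file is too short to supply
three), and the modified line shows up as one deletion and one
insertion.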
206
207 We will return to some of the more subtle aspects of patches later (in
208 section~\ref{sec:mq:adv-patch}), but you should have enough information
209 now to use MQ.
210
211 \section{Getting started with Mercurial Queues}
212 \label{sec:mq:start}
213
214 Because MQ is implemented as an extension, you must explicitly enable
215 it before you can use it. (You don't need to download anything; MQ ships
216 with the standard Mercurial distribution.) To enable MQ, edit your
217 \tildefile{.hgrc} file, and add the lines in figure~\ref{ex:mq:config}.
218
219 \begin{figure}[ht]
220 \begin{codesample4}
221 [extensions]
222 hgext.mq =
223 \end{codesample4}
224 \caption{Contents to add to \tildefile{.hgrc} to enable the MQ extension}
225 \label{ex:mq:config}
226 \end{figure}
227
228 Once the extension is enabled, it will make a number of new commands
229 available. To verify that the extension is working, you can use
230 \hgcmd{help} to see if the \hgxcmd{mq}{qinit} command is now available; see
231 the example in figure~\ref{ex:mq:enabled}.
232
233 \begin{figure}[ht]
234 \interaction{mq.qinit-help.help}
235 \caption{How to verify that MQ is enabled}
236 \label{ex:mq:enabled}
237 \end{figure}
238
239 You can use MQ with \emph{any} Mercurial repository, and its commands
240 only operate within that repository. To get started, simply prepare
241 the repository using the \hgxcmd{mq}{qinit} command (see
242 figure~\ref{ex:mq:qinit}). This command creates an empty directory
243 called \sdirname{.hg/patches}, where MQ will keep its metadata. As
244 with many Mercurial commands, the \hgxcmd{mq}{qinit} command prints nothing
245 if it succeeds.
246
247 \begin{figure}[ht]
248 \interaction{mq.tutorial.qinit}
249 \caption{Preparing a repository for use with MQ}
250 \label{ex:mq:qinit}
251 \end{figure}
252
253 \begin{figure}[ht]
254 \interaction{mq.tutorial.qnew}
255 \caption{Creating a new patch}
256 \label{ex:mq:qnew}
257 \end{figure}
258
259 \subsection{Creating a new patch}
260
261 To begin work on a new patch, use the \hgxcmd{mq}{qnew} command. This
262 command takes one argument, the name of the patch to create. MQ will
263 use this as the name of an actual file in the \sdirname{.hg/patches}
264 directory, as you can see in figure~\ref{ex:mq:qnew}.
265
266 Also newly present in the \sdirname{.hg/patches} directory are two
267 other files, \sfilename{series} and \sfilename{status}. The
268 \sfilename{series} file lists all of the patches that MQ knows about
269 for this repository, with one patch per line. Mercurial uses the
270 \sfilename{status} file for internal book-keeping; it tracks all of the
271 patches that MQ has \emph{applied} in this repository.
272
273 \begin{note}
274 You may sometimes want to edit the \sfilename{series} file by hand;
275 for example, to change the sequence in which some patches are
276 applied. However, manually editing the \sfilename{status} file is
277 almost always a bad idea, as it's easy to corrupt MQ's idea of what
278 is happening.
279 \end{note}
280
281 Once you have created your new patch, you can edit files in the
282 working directory as you usually would. All of the normal Mercurial
283 commands, such as \hgcmd{diff} and \hgcmd{annotate}, work exactly as
284 they did before.
285
286 \subsection{Refreshing a patch}
287
288 When you reach a point where you want to save your work, use the
289 \hgxcmd{mq}{qrefresh} command (figure~\ref{ex:mq:qrefresh}) to update the patch
290 you are working on. This command folds the changes you have made in
291 the working directory into your patch, and updates its corresponding
292 changeset to contain those changes.
293
294 \begin{figure}[ht]
295 \interaction{mq.tutorial.qrefresh}
296 \caption{Refreshing a patch}
297 \label{ex:mq:qrefresh}
298 \end{figure}
299
300 You can run \hgxcmd{mq}{qrefresh} as often as you like, so it's a good way
301 to ``checkpoint'' your work. Refresh your patch at an opportune
302 time; try an experiment; and if the experiment doesn't work out,
303 \hgcmd{revert} your modifications back to the last time you refreshed.
304
305 \begin{figure}[ht]
306 \interaction{mq.tutorial.qrefresh2}
307 \caption{Refresh a patch many times to accumulate changes}
308 \label{ex:mq:qrefresh2}
309 \end{figure}
310
311 \subsection{Stacking and tracking patches}
312
313 Once you have finished working on a patch, or need to work on another,
314 you can use the \hgxcmd{mq}{qnew} command again to create a new patch.
315 Mercurial will apply this patch on top of your existing patch. See
316 figure~\ref{ex:mq:qnew2} for an example. Notice that the patch
317 contains the changes in our prior patch as part of its context (you
318 can see this more clearly in the output of \hgcmd{annotate}).
319
320 \begin{figure}[ht]
321 \interaction{mq.tutorial.qnew2}
322 \caption{Stacking a second patch on top of the first}
323 \label{ex:mq:qnew2}
324 \end{figure}
325
326 So far, with the exception of \hgxcmd{mq}{qnew} and \hgxcmd{mq}{qrefresh}, we've
327 been careful to only use regular Mercurial commands. However, MQ
328 provides many commands that are easier to use when you are thinking
329 about patches, as illustrated in figure~\ref{ex:mq:qseries}:
330
331 \begin{itemize}
332 \item The \hgxcmd{mq}{qseries} command lists every patch that MQ knows
333 about in this repository, from oldest to newest (most recently
334 \emph{created}).
335 \item The \hgxcmd{mq}{qapplied} command lists every patch that MQ has
336 \emph{applied} in this repository, again from oldest to newest (most
337 recently applied).
338 \end{itemize}
339
340 \begin{figure}[ht]
341 \interaction{mq.tutorial.qseries}
342 \caption{Understanding the patch stack with \hgxcmd{mq}{qseries} and
343 \hgxcmd{mq}{qapplied}}
344 \label{ex:mq:qseries}
345 \end{figure}
346
347 \subsection{Manipulating the patch stack}
348
349 The previous discussion implied that there must be a difference
350 between ``known'' and ``applied'' patches, and there is. MQ can
351 manage a patch without it being applied in the repository.
352
353 An \emph{applied} patch has a corresponding changeset in the
354 repository, and the effects of the patch and changeset are visible in
355 the working directory. You can undo the application of a patch using
356 the \hgxcmd{mq}{qpop} command. MQ still \emph{knows about}, or manages, a
357 popped patch, but the patch no longer has a corresponding changeset in
358 the repository, and the working directory does not contain the changes
359 made by the patch. Figure~\ref{fig:mq:stack} illustrates the
360 difference between applied and tracked patches.
361
362 \begin{figure}[ht]
363 \centering
364 \grafix{mq-stack}
365 \caption{Applied and unapplied patches in the MQ patch stack}
366 \label{fig:mq:stack}
367 \end{figure}
368
369 You can reapply an unapplied, or popped, patch using the \hgxcmd{mq}{qpush}
370 command. This creates a new changeset to correspond to the patch, and
371 the patch's changes once again become present in the working
372 directory. See figure~\ref{ex:mq:qpop} for examples of \hgxcmd{mq}{qpop}
373 and \hgxcmd{mq}{qpush} in action. Notice that once we have popped a patch
374 or two patches, the output of \hgxcmd{mq}{qseries} remains the same, while
375 that of \hgxcmd{mq}{qapplied} has changed.
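
One way to picture the difference between known and applied patches is
as a list plus a counter. The following Python toy model is only a
sketch of that idea: the class name \texttt{PatchStack} and the patch
names are invented here, and MQ's real book-keeping is more involved.

\begin{verbatim}
class PatchStack:
    """Toy model of a patch series and its applied prefix (not MQ's code)."""

    def __init__(self, series):
        self.series = list(series)  # every known patch, in order
        self.applied_count = 0      # how many of them are applied

    def applied(self):
        """Return the applied patches, oldest first."""
        return self.series[:self.applied_count]

    def push(self):
        """Apply the next unapplied patch and return its name."""
        if self.applied_count == len(self.series):
            raise IndexError("all patches are already applied")
        self.applied_count += 1
        return self.series[self.applied_count - 1]

    def pop(self):
        """Unapply the topmost applied patch and return its name."""
        if self.applied_count == 0:
            raise IndexError("no patches are applied")
        self.applied_count -= 1
        return self.series[self.applied_count]


stack = PatchStack(["first.patch", "second.patch"])
stack.push()
stack.push()
stack.pop()
print(stack.series)     # the known patches are unchanged
print(stack.applied())  # only the applied prefix has shrunk
\end{verbatim}

In this model, popping never removes a patch from the series; it only
moves the boundary between applied and unapplied patches.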
376
377 \begin{figure}[ht]
378 \interaction{mq.tutorial.qpop}
379 \caption{Modifying the stack of applied patches}
380 \label{ex:mq:qpop}
381 \end{figure}
382
383 \subsection{Pushing and popping many patches}
384
385 While \hgxcmd{mq}{qpush} and \hgxcmd{mq}{qpop} each operate on a single patch at
386 a time by default, you can push and pop many patches in one go. The
387 \hgxopt{mq}{qpush}{-a} option to \hgxcmd{mq}{qpush} causes it to push all
388 unapplied patches, while the \hgxopt{mq}{qpop}{-a} option to \hgxcmd{mq}{qpop}
389 causes it to pop all applied patches. (For some more ways to push and
390 pop many patches, see section~\ref{sec:mq:perf} below.)
391
392 \begin{figure}[ht]
393 \interaction{mq.tutorial.qpush-a}
394 \caption{Pushing all unapplied patches}
395 \label{ex:mq:qpush-a}
396 \end{figure}
397
398 \subsection{Safety checks, and overriding them}
399
400 Several MQ commands check the working directory before they do
401 anything, and fail if they find any modifications. They do this to
402 ensure that you won't lose any changes that you have made, but not yet
403 incorporated into a patch. Figure~\ref{ex:mq:add} illustrates this;
404 the \hgxcmd{mq}{qnew} command will not create a new patch if there are
405 outstanding changes, caused in this case by the \hgcmd{add} of
406 \filename{file3}.
407
408 \begin{figure}[ht]
409 \interaction{mq.tutorial.add}
410 \caption{Forcibly creating a patch}
411 \label{ex:mq:add}
412 \end{figure}
413
414 Commands that check the working directory all take an ``I know what
415 I'm doing'' option, which is always named \option{-f}. The exact
416 meaning of \option{-f} depends on the command. For example,
417 \hgcmdargs{qnew}{\hgxopt{mq}{qnew}{-f}} will incorporate any outstanding
418 changes into the new patch it creates, but
419 \hgcmdargs{qpop}{\hgxopt{mq}{qpop}{-f}} will revert modifications to any
420 files affected by the patch that it is popping. Be sure to read the
421 documentation for a command's \option{-f} option before you use it!
422
423 \subsection{Working on several patches at once}
424
425 The \hgxcmd{mq}{qrefresh} command always refreshes the \emph{topmost}
426 applied patch. This means that you can suspend work on one patch (by
427 refreshing it), pop or push to make a different patch the top, and
428 work on \emph{that} patch for a while.
429
430 Here's an example that illustrates how you can use this ability.
431 Let's say you're developing a new feature as two patches. The first
432 is a change to the core of your software, and the second---layered on
433 top of the first---changes the user interface to use the code you just
434 added to the core. If you notice a bug in the core while you're
435 working on the UI patch, it's easy to fix the core. Simply
436 \hgxcmd{mq}{qrefresh} the UI patch to save your in-progress changes, and
437 \hgxcmd{mq}{qpop} down to the core patch. Fix the core bug,
438 \hgxcmd{mq}{qrefresh} the core patch, and \hgxcmd{mq}{qpush} back to the UI
439 patch to continue where you left off.
440
441 \section{More about patches}
442 \label{sec:mq:adv-patch}
443
444 MQ uses the GNU \command{patch} command to apply patches, so it's
445 helpful to know a few more detailed aspects of how \command{patch}
446 works, and about patches themselves.
447
448 \subsection{The strip count}
449
450 If you look at the file headers in a patch, you will notice that the
451 pathnames usually have an extra component on the front that isn't
452 present in the actual path name. This is a holdover from the way that
453 people used to generate patches (people still do this, but it's
454 somewhat rare with modern revision control tools).
455
456 Alice would unpack a tarball, edit her files, then decide that she
457 wanted to create a patch. So she'd rename her working directory,
458 unpack the tarball again (hence the need for the rename), and use the
459 \cmdopt{diff}{-r} and \cmdopt{diff}{-N} options to \command{diff} to
460 recursively generate a patch between the unmodified directory and the
461 modified one. The result would be that the name of the unmodified
462 directory would be at the front of the left-hand path in every file
463 header, and the name of the modified directory would be at the front
464 of the right-hand path.
465
466 Since someone receiving a patch from the Alices of the net would be
467 unlikely to have unmodified and modified directories with exactly the
468 same names, the \command{patch} command has a \cmdopt{patch}{-p}
469 option that indicates the number of leading path name components to
470 strip when trying to apply a patch. This number is called the
471 \emph{strip count}.
472
473 An option of ``\texttt{-p1}'' means ``use a strip count of one''. If
474 \command{patch} sees a file name \filename{foo/bar/baz} in a file
475 header, it will strip \filename{foo} and try to patch a file named
476 \filename{bar/baz}. (Strictly speaking, the strip count refers to the
477 number of \emph{path separators} (and the components that go with them)
478 to strip. A strip count of one will turn \filename{foo/bar} into
479 \filename{bar}, but \filename{/foo/bar} (notice the extra leading
480 slash) into \filename{foo/bar}.)
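
The effect of a strip count is easy to mimic in a few lines of Python.
This is a sketch under stated assumptions, not the code that
\command{patch} uses: the function name \texttt{strip\_components} is
invented, runs of adjacent slashes are treated as a single separator,
and raising an error when a path has too few separators is simply a
choice made for this example.

\begin{verbatim}
import re


def strip_components(path, count):
    """Drop the smallest prefix of path that contains count separators."""
    # Treat a run of adjacent slashes as a single separator.
    remaining = re.sub(r"/+", "/", path)
    for _ in range(count):
        head, separator, tail = remaining.partition("/")
        if not separator:
            raise ValueError("too few path separators in " + path)
        remaining = tail
    return remaining


print(strip_components("foo/bar/baz", 1))
print(strip_components("foo/bar", 1))
print(strip_components("/foo/bar", 1))
\end{verbatim}

The three calls reproduce the cases discussed above: the first yields
\filename{bar/baz}, the second \filename{bar}, and the third, because
of its leading slash, \filename{foo/bar}.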
481
482 The ``standard'' strip count for patches is one; almost all patches
483 contain one leading path name component that needs to be stripped.
484 Mercurial's \hgcmd{diff} command generates path names in this form,
485 and the \hgcmd{import} command and MQ expect patches to have a strip
486 count of one.
487
488 If you receive a patch from someone that you want to add to your patch
489 queue, and the patch needs a strip count other than one, you cannot
490 just \hgxcmd{mq}{qimport} the patch, because \hgxcmd{mq}{qimport} does not yet
491 have a \texttt{-p} option (see~\bug{311}). Your best bet is to
492 \hgxcmd{mq}{qnew} a patch of your own, then use \cmdargs{patch}{-p\emph{N}}
493 to apply their patch, followed by \hgcmd{addremove} to pick up any
494 files added or removed by the patch, followed by \hgxcmd{mq}{qrefresh}.
495 This complexity may become unnecessary; see~\bug{311} for details.
496 \subsection{Strategies for applying a patch}
497
498 When \command{patch} applies a hunk, it tries a handful of
499 successively less accurate strategies to try to make the hunk apply.
500 This falling-back technique often makes it possible to take a patch
501 that was generated against an old version of a file, and apply it
502 against a newer version of that file.
503
504 First, \command{patch} tries an exact match, where the line numbers,
505 the context, and the text to be modified must apply exactly. If it
506 cannot make an exact match, it tries to find an exact match for the
507 context, without honouring the line numbering information. If this
508 succeeds, it prints a line of output saying that the hunk was applied,
509 but at some \emph{offset} from the original line number.
510
511 If a context-only match fails, \command{patch} removes the first and
512 last lines of the context, and tries a \emph{reduced} context-only
513 match. If the hunk with reduced context succeeds, it prints a message
514 saying that it applied the hunk with a \emph{fuzz factor} (the number
515 after the fuzz factor indicates how many lines of context
516 \command{patch} had to trim before the patch applied).
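
To make the order of these strategies concrete, here is a simplified
Python sketch of the search. It is an illustration, not a
reimplementation of \command{patch}: the function name
\texttt{locate\_hunk} is invented, the limit of two lines of fuzz is
an assumption, and the sketch trims lines from both ends of the hunk
on the assumption that they are context lines.

\begin{verbatim}
def locate_hunk(file_lines, old_lines, expected_start, max_fuzz=2):
    """Find where a hunk applies; return (offset, fuzz), or None to reject.

    file_lines     -- the target file, as a list of lines
    old_lines      -- the lines the hunk expects to find there
    expected_start -- zero-based index at which the hunk header places them
    """
    for fuzz in range(max_fuzz + 1):
        # Ignore `fuzz` lines at each end (assumed to be context lines).
        wanted = old_lines[fuzz:len(old_lines) - fuzz]
        if not wanted:
            break
        start = expected_start + fuzz
        last = len(file_lines) - len(wanted)
        # Try the expected position first, then positions further away.
        for position in sorted(range(last + 1), key=lambda p: abs(p - start)):
            if file_lines[position:position + len(wanted)] == wanted:
                return position - start, fuzz
    return None


hunk = ["b", "c", "d"]
print(locate_hunk(["a", "b", "c", "d", "e"], hunk, 1))
print(locate_hunk(["x", "x", "a", "b", "c", "d"], hunk, 1))
print(locate_hunk(["a", "B", "c", "D", "e"], hunk, 1))
print(locate_hunk(["q"], hunk, 0))
\end{verbatim}

The four calls show an exact match, a match at an offset of two lines,
a match that needs one line of fuzz, and a hunk that cannot be placed
at all.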
517
518 When none of these techniques works, \command{patch} prints a
519 message saying that the hunk in question was rejected. It saves
520 rejected hunks (also simply called ``rejects'') to a file with the
521 same name, and an added \sfilename{.rej} extension. It also saves an
522 unmodified copy of the file with a \sfilename{.orig} extension; the
523 copy of the file without any extensions will contain any changes made
524 by hunks that \emph{did} apply cleanly. If you have a patch that
525 modifies \filename{foo} with six hunks, and one of them fails to
526 apply, you will have: an unmodified \filename{foo.orig}, a
527 \filename{foo.rej} containing one hunk, and \filename{foo}, containing
528 the changes made by the five successful hunks.
529
530 \subsection{Some quirks of patch representation}
531
532 There are a few useful things to know about how \command{patch} works
533 with files.
534 \begin{itemize}
535 \item This should already be obvious, but \command{patch} cannot
536 handle binary files.
537 \item Neither does it care about the executable bit; it creates new
538 files as readable, but not executable.
539 \item \command{patch} treats the removal of a file as a diff between
540 the file to be removed and the empty file. So your idea of ``I
541 deleted this file'' looks like ``every line of this file was
542 deleted'' in a patch.
543 \item It treats the addition of a file as a diff between the empty
544 file and the file to be added. So in a patch, your idea of ``I
545 added this file'' looks like ``every line of this file was added''.
546 \item It treats a renamed file as the removal of the old name, and the
547 addition of the new name. This means that renamed files have a big
548 footprint in patches. (Note also that Mercurial does not currently
549 try to infer when files have been renamed or copied in a patch.)
550 \item \command{patch} cannot represent empty files, so you cannot use
551 a patch to represent the notion ``I added this empty file to the
552 tree''.
553 \end{itemize}
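
The points about added and empty files can be checked with Python's
standard \texttt{difflib} module. Treat this as a sketch: the helper
name and file names are invented, and the \texttt{a/} and \texttt{b/}
prefixes are used for both sides purely for simplicity.

\begin{verbatim}
import difflib


def unified(old_lines, new_lines, name):
    """Return the unified diff between two lists of lines, as a list."""
    return list(difflib.unified_diff(
        old_lines, new_lines,
        fromfile="a/" + name, tofile="b/" + name, lineterm=""))


added = unified([], ["hello", "world"], "greeting.txt")
empty = unified([], [], "empty.txt")
print("\n".join(added))
print(len(empty))
\end{verbatim}

Adding a two-line file produces a hunk in which every line is an
insertion. Comparing an empty file with an empty file produces no
output at all, which is why a patch has no way to say ``I added this
empty file''.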
554 \subsection{Beware the fuzz}
555
556 While applying a hunk at an offset, or with a fuzz factor, will often
557 be completely successful, these inexact techniques naturally leave
558 open the possibility of corrupting the patched file. The most common
559 cases typically involve applying a patch twice, or at an incorrect
560 location in the file. If \command{patch} or \hgxcmd{mq}{qpush} ever
561 mentions an offset or fuzz factor, you should make sure that the
562 modified files are correct afterwards.
563
564 It's often a good idea to refresh a patch that has applied with an
565 offset or fuzz factor; refreshing the patch generates new context
566 information that will make it apply cleanly. I say ``often,'' not
567 ``always,'' because sometimes refreshing a patch will make it fail to
568 apply against a different revision of the underlying files. In some
569 cases, such as when you're maintaining a patch that must sit on top of
570 multiple versions of a source tree, it's acceptable to have a patch
571 apply with some fuzz, provided you've verified the results of the
572 patching process in such cases.
573
574 \subsection{Handling rejection}
575
576 If \hgxcmd{mq}{qpush} fails to apply a patch, it will print an error
577 message and exit. If it has left \sfilename{.rej} files behind, it is
578 usually best to fix up the rejected hunks before you push more patches
579 or do any further work.
580
581 If your patch \emph{used to} apply cleanly, and no longer does because
582 you've changed the underlying code that your patches are based on,
583 Mercurial Queues can help; see section~\ref{sec:mq:merge} for details.
584
585 Unfortunately, there aren't any great techniques for dealing with
586 rejected hunks. Most often, you'll need to view the \sfilename{.rej}
587 file and edit the target file, applying the rejected hunks by hand.
588
589 If you're feeling adventurous, Neil Brown, a Linux kernel hacker,
590 wrote a tool called \command{wiggle}~\cite{web:wiggle}, which is more
591 vigorous than \command{patch} in its attempts to make a patch apply.
592
593 Another Linux kernel hacker, Chris Mason (the author of Mercurial
594 Queues), wrote a similar tool called
595 \command{mpatch}~\cite{web:mpatch}, which takes a simple approach to
596 automating the application of hunks rejected by \command{patch}. The
597 \command{mpatch} command can help with four common reasons that a hunk
598 may be rejected:
599
600 \begin{itemize}
601 \item The context in the middle of a hunk has changed.
602 \item A hunk is missing some context at the beginning or end.
603 \item A large hunk might apply better---either entirely or in
604 part---if it was broken up into smaller hunks.
605 \item A hunk removes lines with slightly different content than those
606 currently present in the file.
607 \end{itemize}
608
609 If you use \command{wiggle} or \command{mpatch}, you should be doubly
610 careful to check your results when you're done. In fact,
611 \command{mpatch} enforces this method of double-checking the tool's
612 output, by automatically dropping you into a merge program when it has
613 done its job, so that you can verify its work and finish off any
614 remaining merges.
615
616 \section{Getting the best performance out of MQ}
617 \label{sec:mq:perf}
618
619 MQ is very efficient at handling a large number of patches. I ran
620 some performance experiments in mid-2006 for a talk that I gave at the
621 2006 EuroPython conference~\cite{web:europython}. I used as my data
622 set the Linux 2.6.17-mm1 patch series, which consists of 1,738
623 patches. I applied these on top of a Linux kernel repository
624 containing all 27,472 revisions between Linux 2.6.12-rc2 and Linux
625 2.6.17.
626
627 On my old, slow laptop, I was able to
628 \hgcmdargs{qpush}{\hgxopt{mq}{qpush}{-a}} all 1,738 patches in 3.5 minutes,
629 and \hgcmdargs{qpop}{\hgxopt{mq}{qpop}{-a}} them all in 30 seconds. (On a
630 newer laptop, the time to push all patches dropped to two minutes.) I
631 could \hgxcmd{mq}{qrefresh} one of the biggest patches (which made 22,779
632 lines of changes to 287 files) in 6.6 seconds.
633
634 Clearly, MQ is well suited to working in large trees, but there are a
635 few tricks you can use to get the best performance out of it.
636
637 First of all, try to ``batch'' operations together. Every time you
638 run \hgxcmd{mq}{qpush} or \hgxcmd{mq}{qpop}, these commands scan the working
639 directory once to make sure you haven't made some changes and then
640 forgotten to run \hgxcmd{mq}{qrefresh}. On a small tree, the time that
641 this scan takes is unnoticeable. However, on a medium-sized tree
642 (containing tens of thousands of files), it can take a second or more.
643
644 The \hgxcmd{mq}{qpush} and \hgxcmd{mq}{qpop} commands allow you to push and pop
645 multiple patches at a time. You can identify the ``destination
646 patch'' that you want to end up at. When you \hgxcmd{mq}{qpush} with a
647 destination specified, it will push patches until that patch is at the
648 top of the applied stack. When you \hgxcmd{mq}{qpop} to a destination, MQ
649 will pop patches until the destination patch is at the top.
650
651 You can identify a destination patch either by the name of the
652 patch or by number. If you use numeric addressing, patches are
653 counted from zero; this means that the first patch is zero, the second
654 is one, and so on.
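
As a sketch of what batching looks like in practice, suppose your
series contains a patch named \filename{rework-device-alloc.patch} (a
name made up for this example). A single command takes you straight
to a destination, so the working directory is scanned once rather than
once per patch.

\begin{codesample2}
  # Push patches until the named patch is on top of the applied stack.
  hg qpush rework-device-alloc.patch
  # Pop patches until the first patch in the series (number zero) is on top.
  hg qpop 0
\end{codesample2}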
655
656 \section{Updating your patches when the underlying code changes}
657 \label{sec:mq:merge}
658
659 It's common to have a stack of patches on top of an underlying
660 repository that you don't modify directly. If you're working on
661 changes to third-party code, or on a feature that takes long enough
662 to develop that the code beneath it changes in the meantime, you will often
663 need to sync up with the underlying code, and fix up any hunks in your
664 patches that no longer apply. This is called \emph{rebasing} your
665 patch series.
666
667 The simplest way to do this is to \hgcmdargs{qpop}{\hgxopt{mq}{qpop}{-a}}
668 your patches, then \hgcmd{pull} changes into the underlying
669 repository, and finally \hgcmdargs{qpush}{\hgxopt{mq}{qpush}{-a}} your
670 patches again. MQ will stop pushing any time it runs across a patch
671 that fails to apply because of conflicts, allowing you to fix your
672 conflicts, \hgxcmd{mq}{qrefresh} the affected patch, and continue pushing
673 until you have fixed your entire stack.
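
Spelled out as commands, the simple approach looks something like the
following sketch. The \hgcmd{update} step is an assumption about what
the description above leaves implicit: \hgcmd{pull} on its own brings
new changesets into the repository without touching the working
directory.

\begin{codesample2}
  # Unapply every patch, leaving only the underlying revisions.
  hg qpop -a
  # Fetch upstream changes, then move the working directory to them.
  hg pull
  hg update
  # Reapply the patches; MQ stops at the first one that fails to apply.
  hg qpush -a
\end{codesample2}

If the push stops partway, fix the conflict, \hgxcmd{mq}{qrefresh} the
affected patch, and run \hgcmdargs{qpush}{\hgxopt{mq}{qpush}{-a}} again.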
674
675 This approach is easy to use and works well if you don't expect
676 changes to the underlying code to affect how well your patches apply.
677 If your patch stack touches code that is modified frequently or
678 invasively in the underlying repository, however, fixing up rejected
679 hunks by hand quickly becomes tiresome.
680
681 It's possible to partially automate the rebasing process. If your
682 patches apply cleanly against some revision of the underlying repo, MQ
683 can use this information to help you to resolve conflicts between your
684 patches and a different revision.
685
686 The process is a little involved.
687 \begin{enumerate}
688 \item To begin, \hgcmdargs{qpush}{-a} all of your patches on top of
689 the revision where you know that they apply cleanly.
690 \item Save a backup copy of your patch directory using
691 \hgcmdargs{qsave}{\hgxopt{mq}{qsave}{-e} \hgxopt{mq}{qsave}{-c}}. This prints
692 the name of the directory that it has saved the patches in. It will
693 save the patches to a directory called
694 \sdirname{.hg/patches.\emph{N}}, where \texttt{\emph{N}} is a small
695 integer. It also commits a ``save changeset'' on top of your
696 applied patches; this is for internal book-keeping, and records the
697 states of the \sfilename{series} and \sfilename{status} files.
698 \item Use \hgcmd{pull} to bring new changes into the underlying
699 repository. (Don't run \hgcmdargs{pull}{-u}; see below for why.)
700 \item Update to the new tip revision, using
701 \hgcmdargs{update}{\hgopt{update}{-C}} to override the patches you
702 have pushed.
703 \item Merge all patches using \hgcmdargs{qpush}{\hgxopt{mq}{qpush}{-m}
704 \hgxopt{mq}{qpush}{-a}}. The \hgxopt{mq}{qpush}{-m} option to \hgxcmd{mq}{qpush}
705 tells MQ to perform a three-way merge if the patch fails to apply.
706 \end{enumerate}
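
Put together, the steps above amount to the following sequence of
commands. Treat it as a sketch: it assumes that the working directory
starts out at the revision where your patches apply cleanly.

\begin{codesample2}
  # Step 1: apply all patches where they are known to apply cleanly.
  hg qpush -a
  # Step 2: save the patch directory and commit a save changeset.
  hg qsave -e -c
  # Step 3: bring in new changes without updating the working directory.
  hg pull
  # Step 4: jump to the new tip, overriding the pushed patches.
  hg update -C
  # Step 5: reapply all patches, merging any that do not apply cleanly.
  hg qpush -m -a
\end{codesample2}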
707
708 During the \hgcmdargs{qpush}{\hgxopt{mq}{qpush}{-m}}, each patch in the
709 \sfilename{series} file is applied normally. If a patch applies with
710 fuzz or rejects, MQ looks at the queue you \hgxcmd{mq}{qsave}d, and
711 performs a three-way merge with the corresponding changeset. This
712 merge uses Mercurial's normal merge machinery, so it may pop up a GUI
713 merge tool to help you to resolve problems.
714
715 When you finish resolving the effects of a patch, MQ refreshes your
716 patch based on the result of the merge.
717
718 At the end of this process, your repository will have one extra head
719 from the old patch queue, and a copy of the old patch queue will be in
720 \sdirname{.hg/patches.\emph{N}}. You can remove the extra head using
721 \hgcmdargs{qpop}{\hgxopt{mq}{qpop}{-a} \hgxopt{mq}{qpop}{-n} patches.\emph{N}}
722 or \hgcmd{strip}. You can delete \sdirname{.hg/patches.\emph{N}} once
723 you are sure that you no longer need it as a backup.
724
725 \section{Identifying patches}
726
727 MQ commands that work with patches let you refer to a patch either by
728 using its name or by a number. By name is obvious enough; pass the
729 name \filename{foo.patch} to \hgxcmd{mq}{qpush}, for example, and it will
730 push patches until \filename{foo.patch} is applied.
731
732 As a shortcut, you can refer to a patch using both a name and a
733 numeric offset; \texttt{foo.patch-2} means ``two patches before
734 \texttt{foo.patch}'', while \texttt{bar.patch+4} means ``four patches
735 after \texttt{bar.patch}''.
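
For instance, using the two hypothetical patch names above, you could
pass offsets to \hgxcmd{mq}{qpop} and \hgxcmd{mq}{qpush} as in the
following sketch. It assumes that both patches exist in the series and
that each target lies on the appropriate side of the current top
patch.

\begin{codesample2}
  # Pop until the patch two before foo.patch is on top of the stack.
  hg qpop foo.patch-2
  # Push until the patch four after bar.patch has been applied.
  hg qpush bar.patch+4
\end{codesample2}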
736
737 Referring to a patch by index isn't much different. The first patch
738 printed in the output of \hgxcmd{mq}{qseries} is patch zero (yes, it's one
739 of those start-at-zero counting systems); the second is patch one; and
740 so on.
741
742 MQ also makes it easy to work with patches when you are using normal
743 Mercurial commands. Every command that accepts a changeset ID will
744 also accept the name of an applied patch. MQ augments the tags
745 normally in the repository with an eponymous one for each applied
746 patch. In addition, the special tags \index{tags!special tag
747 names!\texttt{qbase}}\texttt{qbase} and \index{tags!special tag
748 names!\texttt{qtip}}\texttt{qtip} identify the ``bottom-most'' and
749 topmost applied patches, respectively.
750
751 These additions to Mercurial's normal tagging capabilities make
752 dealing with patches even more of a breeze.
753 \begin{itemize}
754 \item Want to patchbomb a mailing list with your latest series of
755 changes?
756 \begin{codesample4}
757 hg email qbase:qtip
758 \end{codesample4}
759 (Don't know what ``patchbombing'' is? See
760 section~\ref{sec:hgext:patchbomb}.)
761 \item Need to see all of the patches since \texttt{foo.patch} that
762 have touched files in a subdirectory of your tree?
763 \begin{codesample4}
764 hg log -r foo.patch:qtip \emph{subdir}
765 \end{codesample4}
766 \end{itemize}
767
768 Because MQ makes the names of patches available to the rest of
769 Mercurial through its normal internal tag machinery, you don't need to
770 type in the entire name of a patch when you want to identify it by
771 name.
772
773 \begin{figure}[ht]
774 \interaction{mq.id.output}
775 \caption{Using MQ's tag features to work with patches}
776 \label{ex:mq:id}
777 \end{figure}
778
779 Another nice consequence of representing patch names as tags is that
780 when you run the \hgcmd{log} command, it will display a patch's name
781 as a tag, simply as part of its normal output. This makes it easy to
782 visually distinguish applied patches from underlying ``normal''
783 revisions. Figure~\ref{ex:mq:id} shows a few normal Mercurial
784 commands in use with applied patches.
785
786 \section{Useful things to know about}
787
788 There are a number of aspects of MQ usage that don't fit tidily into
789 sections of their own, but that are good to know. Here they are, in
790 one place.
791
792 \begin{itemize}
793 \item Normally, when you \hgxcmd{mq}{qpop} a patch and \hgxcmd{mq}{qpush} it
794 again, the changeset that represents the patch after the pop/push
795 will have a \emph{different identity} than the changeset that
796 represented the patch beforehand. See
797 section~\ref{sec:mqref:cmd:qpush} for information as to why this is.
798 \item It's not a good idea to \hgcmd{merge} changes from another
799 branch with a patch changeset, at least if you want to maintain the
800 ``patchiness'' of that changeset and changesets below it on the
801 patch stack. If you try to do this, it will appear to succeed, but
802 MQ will become confused.
803 \end{itemize}
804
805 \section{Managing patches in a repository}
806 \label{sec:mq:repo}
807
808 Because MQ's \sdirname{.hg/patches} directory resides outside a
809 Mercurial repository's working directory, the ``underlying'' Mercurial
810 repository knows nothing about the management or presence of patches.
811
812 This presents the interesting possibility of managing the contents of
813 the patch directory as a Mercurial repository in its own right. This
814 can be a useful way to work. For example, you can work on a patch for
815 a while, \hgxcmd{mq}{qrefresh} it, then \hgcmd{commit} the current state of
816 the patch. This lets you ``roll back'' to that version of the patch
817 later on.
818
819 You can then share different versions of the same patch stack among
820 multiple underlying repositories. I use this when I am developing a
821 Linux kernel feature. I have a pristine copy of my kernel sources for
822 each of several CPU architectures, and a cloned repository under each
823 that contains the patches I am working on. When I want to test a
824 change on a different architecture, I push my current patches to the
825 patch repository associated with that kernel tree, pop and push all of
826 my patches, and build and test that kernel.
827
828 Managing patches in a repository makes it possible for multiple
829 developers to work on the same patch series without colliding with
830 each other, all on top of an underlying source base that they may or
831 may not control.
832
833 \subsection{MQ support for patch repositories}
834
835 MQ helps you to work with the \sdirname{.hg/patches} directory as a
836 repository; when you prepare a repository for working with patches
837 using \hgxcmd{mq}{qinit}, you can pass the \hgxopt{mq}{qinit}{-c} option to
838 create the \sdirname{.hg/patches} directory as a Mercurial repository.
839
840 \begin{note}
841 If you forget to use the \hgxopt{mq}{qinit}{-c} option, you can simply go
842 into the \sdirname{.hg/patches} directory at any time and run
843 \hgcmd{init}. Don't forget to add an entry for the
844 \sfilename{status} file to the \sfilename{.hgignore} file, though
846 (\hgcmdargs{qinit}{\hgxopt{mq}{qinit}{-c}} does this for you
847 automatically); you \emph{really} don't want to manage the
848 \sfilename{status} file.
849 \end{note}
850
851 As a convenience, if MQ notices that the \dirname{.hg/patches}
852 directory is a repository, it will automatically \hgcmd{add} every
853 patch that you create and import.
854
855 MQ provides a shortcut command, \hgxcmd{mq}{qcommit}, that runs
856 \hgcmd{commit} in the \sdirname{.hg/patches} directory. This saves
857 some bothersome typing.
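
A checkpointing session might look like the following sketch. The
patch name and the commit message are placeholders, and the editing
that happens between \hgxcmd{mq}{qnew} and \hgxcmd{mq}{qrefresh} stands
in for whatever work you do on the patch.

\begin{codesample2}
  # Create the patch directory as a Mercurial repository.
  hg qinit -c
  # Start a new patch, edit some files, then record the changes in it.
  hg qnew rework-device-alloc.patch
  hg qrefresh
  # Commit the current state of the patch directory.
  hg qcommit -m 'First working version of the device allocation rework'
\end{codesample2}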
858
859 Finally, as a convenience to manage the patch directory, you can
860 define the alias \command{mq} on Unix systems. For example, on Linux
861 systems using the \command{bash} shell, you can include the following
862 snippet in your \tildefile{.bashrc}.
863
864 \begin{codesample2}
865 alias mq=`hg -R \$(hg root)/.hg/patches'
866 \end{codesample2}
867
868 You can then issue commands of the form \cmdargs{mq}{pull} from
869 the main repository.
870
871 \subsection{A few things to watch out for}
872
873 MQ's support for working with a repository full of patches is limited
874 in a few small respects.
875
876 MQ cannot automatically detect changes that you make to the patch
877 directory. If you \hgcmd{pull}, manually edit, or \hgcmd{update}
878 changes to patches or the \sfilename{series} file, you will have to
879 \hgcmdargs{qpop}{\hgxopt{mq}{qpop}{-a}} and then
880 \hgcmdargs{qpush}{\hgxopt{mq}{qpush}{-a}} in the underlying repository to
881 see those changes show up there. If you forget to do this, you can
882 confuse MQ's idea of which patches are applied.
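
One cautious way to pick up changes from a shared patch repository is
sketched below. It relies on the \command{mq} alias defined earlier.
Popping all patches \emph{before} the patch directory changes is an
assumption about the safest order; the description above only requires
that both commands are run.

\begin{codesample2}
  # Unapply all patches while the patch directory is still unchanged.
  hg qpop -a
  # Pull new patch versions and update the patch directory to them.
  mq pull -u
  # Reapply the patches from the updated patch directory.
  hg qpush -a
\end{codesample2}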
883
884 \section{Third party tools for working with patches}
885 \label{sec:mq:tools}
886
887 Once you've been working with patches for a while, you'll find
888 yourself hungry for tools that will help you to understand and
889 manipulate the patches you're dealing with.
890
891 The \command{diffstat} command~\cite{web:diffstat} generates a
892 histogram of the modifications made to each file in a patch. It
893 provides a good way to ``get a sense of'' a patch---which files it
894 affects, and how much change it introduces to each file and as a
895 whole. (I find that it's a good idea to use \command{diffstat}'s
896 \cmdopt{diffstat}{-p} option as a matter of course, as otherwise it
897 will try to do clever things with prefixes of file names that
898 inevitably confuse at least me.)
899
900 \begin{figure}[ht]
901 \interaction{mq.tools.tools}
902 \caption{The \command{diffstat}, \command{filterdiff}, and \command{lsdiff} commands}
903 \label{ex:mq:tools}
904 \end{figure}
905
906 The \package{patchutils} package~\cite{web:patchutils} is invaluable.
907 It provides a set of small utilities that follow the ``Unix
908 philosophy;'' each does one useful thing with a patch. The
909 \package{patchutils} command I use most is \command{filterdiff}, which
910 extracts subsets from a patch file. For example, given a patch that
911 modifies hundreds of files across dozens of directories, a single
912 invocation of \command{filterdiff} can generate a smaller patch that
913 only touches files whose names match a particular glob pattern. See
914 section~\ref{mq-collab:tips:interdiff} for another example.
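
As a small sketch of that kind of extraction, the following command
pulls out just the changes under one directory. The patch file names
and the glob pattern are hypothetical.

\begin{codesample2}
  # Keep only the files whose names match the glob; write a smaller patch.
  filterdiff -i '*/drivers/net/*' big.patch > net-only.patch
\end{codesample2}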
915
916 \section{Good ways to work with patches}
917
918 Whether you are working on a patch series to submit to a free software
919 or open source project, or a series that you intend to treat as a
920 sequence of regular changesets when you're done, you can use some
921 simple techniques to keep your work well organised.
922
923 Give your patches descriptive names. A good name for a patch might be
924 \filename{rework-device-alloc.patch}, because it will immediately give
925 you a hint as to the purpose of the patch. Long names shouldn't be
926 a problem; you won't be typing the names often, but you \emph{will} be
927 running commands like \hgxcmd{mq}{qapplied} and \hgxcmd{mq}{qtop} over and over.
928 Good naming becomes especially important when you have a number of
929 patches to work with, or if you are juggling a number of different
930 tasks and your patches only get a fraction of your attention.
931
932 Be aware of what patch you're working on. Use the \hgxcmd{mq}{qtop}
933 command and skim over the text of your patches frequently (for
934 example, using \hgcmdargs{tip}{\hgopt{tip}{-p}}) to be sure of where
935 you stand. I have several times worked on and \hgxcmd{mq}{qrefresh}ed a
936 patch other than the one I intended, and it's often tricky to migrate
937 changes into the right patch after making them in the wrong one.
938
939 For this reason, it is very much worth investing a little time to
940 learn how to use some of the third-party tools I described in
941 section~\ref{sec:mq:tools}, particularly \command{diffstat} and
942 \command{filterdiff}. The former will give you a quick idea of what
943 changes your patch is making, while the latter makes it easy to splice
944 hunks selectively out of one patch and into another.
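
A quick habit along these lines is to check the top patch and
summarise it before you refresh. The sketch below uses only commands
already mentioned in this chapter. Piping the output of \hgcmd{tip}
into \command{diffstat} is an assumption about a convenient
combination, not a prescribed workflow.

\begin{codesample2}
  # Show which patch is on top of the applied stack.
  hg qtop
  # Summarise the files that the top patch changes.
  hg tip -p | diffstat -p1
\end{codesample2}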
945
946 \section{MQ cookbook}
947
948 \subsection{Manage ``trivial'' patches}
949
950 Because the overhead of dropping files into a new Mercurial repository
951 is so low, it makes a lot of sense to manage patches this way even if
952 you simply want to make a few changes to a source tarball that you
953 downloaded.
954
955 Begin by downloading and unpacking the source tarball,
956 and turning it into a Mercurial repository.
957 \interaction{mq.tarball.download}
958
959 Continue by creating a patch stack and making your changes.
960 \interaction{mq.tarball.qinit}
961
962 Let's say a few weeks or months pass, and your package author releases
963 a new version. First, bring their changes into the repository.
964 \interaction{mq.tarball.newsource}
965 The pipeline starting with \hgcmd{locate} above deletes all files in
966 the working directory, so that \hgcmd{commit}'s
967 \hgopt{commit}{--addremove} option can actually tell which files have
968 really been removed in the newer version of the source.
969
970 Finally, you can apply your patches on top of the new tree.
971 \interaction{mq.tarball.repush}
972
973 \subsection{Combining entire patches}
974 \label{sec:mq:combine}
975
976 MQ provides a command, \hgxcmd{mq}{qfold}, that lets you combine entire
977 patches. This ``folds'' the patches you name, in the order you name
978 them, into the topmost applied patch, and concatenates their
979 descriptions onto the end of its description. The patches that you
980 fold must be unapplied before you fold them.
981
982 The order in which you fold patches matters. If your topmost applied
983 patch is \texttt{foo}, and you \hgxcmd{mq}{qfold} \texttt{bar} and
984 \texttt{quux} into it, you will end up with a patch that has the same
985 effect as if you applied first \texttt{foo}, then \texttt{bar},
986 followed by \texttt{quux}.
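
In command form, that example might look like the sketch below. It
assumes that \texttt{bar} and \texttt{quux} are currently applied above
\texttt{foo}, so that popping back to \texttt{foo} leaves them
unapplied.

\begin{codesample2}
  # Make foo the topmost applied patch; bar and quux become unapplied.
  hg qpop foo
  # Fold bar, then quux, into foo, in that order.
  hg qfold bar quux
\end{codesample2}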
987
988 \subsection{Merging part of one patch into another}
989
990 Merging \emph{part} of one patch into another is more difficult than
991 combining entire patches.
992
993 If you want to move changes to entire files, you can use
994 \command{filterdiff}'s \cmdopt{filterdiff}{-i} and
995 \cmdopt{filterdiff}{-x} options to choose the modifications to snip
996 out of one patch, concatenating its output onto the end of the patch
997 you want to merge into. You usually won't need to modify the patch
998 you've merged the changes from. Instead, MQ will report some rejected
999 hunks when you \hgxcmd{mq}{qpush} it (from the hunks you moved into the
1000 other patch), and you can simply \hgxcmd{mq}{qrefresh} the patch to drop
1001 the duplicate hunks.
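
For example, to move every change under one directory from one patch
to another, something like the following sketch would do. The patch
names and the glob pattern are hypothetical, and the command is assumed
to run from the top of the repository.

\begin{codesample2}
  # Append the changes to matching files onto the end of dest.patch.
  filterdiff -i '*/docs/*' .hg/patches/source.patch >> .hg/patches/dest.patch
\end{codesample2}

Swapping \cmdopt{filterdiff}{-i} for \cmdopt{filterdiff}{-x} selects
everything \emph{except} the matching files.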
1002
1003 If you have a patch that has multiple hunks modifying a file, and you
1004 only want to move a few of those hunks, the job becomes messier,
1005 but you can still partly automate it. Use \cmdargs{lsdiff}{-nvv} to
1006 print some metadata about the patch.
1007 \interaction{mq.tools.lsdiff}
1008
1009 This command prints three different kinds of number:
1010 \begin{itemize}
1011 \item (in the first column) a \emph{file number} to identify each file
1012 modified in the patch;
1013 \item (on the next line, indented) the line number within a modified
1014 file where a hunk starts; and
1015 \item (on the same line) a \emph{hunk number} to identify that hunk.
1016 \end{itemize}
1017
1018 You'll have to use some visual inspection, and reading of the patch,
1019 to identify the file and hunk numbers you'll want, but you can then
1020 pass them to \command{filterdiff}'s \cmdopt{filterdiff}{--files}
1021 and \cmdopt{filterdiff}{--hunks} options, to select exactly the file
1022 and hunk you want to extract.
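
Supposing that inspection shows you want the third hunk of the second
file (both numbers are made up for illustration, as are the file
names), the extraction might be sketched as follows.

\begin{codesample2}
  # List the file numbers and hunk numbers for the patch.
  lsdiff -nvv source.patch
  # Extract only hunk 3 of file 2.
  filterdiff --files=2 --hunks=3 source.patch > hunk.patch
\end{codesample2}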
1023
1024 Once you have this hunk, you can concatenate it onto the end of your
1025 destination patch and continue with the remainder of
1026 section~\ref{sec:mq:combine}.
1027
1028 \section{Differences between quilt and MQ}
1029
1030 If you are already familiar with quilt, you'll find that MQ provides a
1031 similar command set. There are a few differences in the way that it works.
1032
1033 You will already have noticed that most quilt commands have MQ
1034 counterparts that simply begin with a ``\texttt{q}''. The exceptions
1035 are quilt's \texttt{add} and \texttt{remove} commands, the
1036 counterparts for which are the normal Mercurial \hgcmd{add} and
1037 \hgcmd{remove} commands. There is no MQ equivalent of the quilt
1038 \texttt{edit} command.
1039
1040 %%% Local Variables:
1041 %%% mode: latex
1042 %%% TeX-master: "00book"
1043 %%% End: