comparison en/ch13-mq-collab.tex @ 649:5cd47f721686
Rename LaTeX input files to have numeric prefixes
author | Bryan O'Sullivan <bos@serpentine.com> |
---|---|
date | Thu, 29 Jan 2009 22:56:27 -0800 |
parents | en/mq-collab.tex@97e929385442 |
children | f72b7e6cbe90 |
648:bc14f94e726a | 649:5cd47f721686 |
---|---|
1 \chapter{Advanced uses of Mercurial Queues} | |
2 \label{chap:mq-collab} | |
3 | |
4 While it's easy to pick up straightforward uses of Mercurial Queues, | |
5 a little discipline and some of MQ's less frequently used | |
6 capabilities make it possible to work in complicated development | |
7 environments. | |
8 | |
9 In this chapter, I will use as an example a technique I have used to | |
10 manage the development of an Infiniband device driver for the Linux | |
11 kernel. The driver in question is large (at least as drivers go), | |
12 with 25,000 lines of code spread across 35 source files. It is | |
13 maintained by a small team of developers. | |
14 | |
15 While much of the material in this chapter is specific to Linux, the | |
16 same principles apply to any code base for which you're not the | |
17 primary owner, and upon which you need to do a lot of development. | |
18 | |
19 \section{The problem of many targets} | |
20 | |
21 The Linux kernel changes rapidly, and has never been internally | |
22 stable; developers frequently make drastic changes between releases. | |
23 This means that a version of the driver that works well with a | |
24 particular released version of the kernel will typically not even | |
25 \emph{compile} correctly against any other version. | |
26 | |
27 To maintain a driver, we have to keep a number of distinct versions of | |
28 Linux in mind. | |
29 \begin{itemize} | |
30 \item One target is the main Linux kernel development tree. | |
31 Maintenance of the code is in this case partly shared by other | |
32 developers in the kernel community, who make ``drive-by'' | |
33 modifications to the driver as they develop and refine kernel | |
34 subsystems. | |
35 \item We also maintain a number of ``backports'' to older versions of | |
36 the Linux kernel, to support the needs of customers who are running | |
37 older Linux distributions that do not incorporate our drivers. (To | |
38 \emph{backport} a piece of code is to modify it to work in an older | |
39 version of its target environment than the version it was developed | |
40 for.) | |
41 \item Finally, we make software releases on a schedule that is | |
42 necessarily not aligned with those used by Linux distributors and | |
43 kernel developers, so that we can deliver new features to customers | |
44 without forcing them to upgrade their entire kernels or | |
45 distributions. | |
46 \end{itemize} | |
47 | |
48 \subsection{Tempting approaches that don't work well} | |
49 | |
50 There are two ``standard'' ways to maintain a piece of software that | |
51 has to target many different environments. | |
52 | |
53 The first is to maintain a number of branches, each intended for a | |
54 single target. The trouble with this approach is that you must | |
55 maintain iron discipline in the flow of changes between repositories. | |
56 A new feature or bug fix must start life in a ``pristine'' repository, | |
57 then percolate out to every backport repository. Backport changes are | |
58 more limited in the branches they should propagate to; a backport | |
59 change that is applied to a branch where it doesn't belong will | |
60 probably stop the driver from compiling. | |
61 | |
62 The second is to maintain a single source tree filled with conditional | |
63 statements that turn chunks of code on or off depending on the | |
64 intended target. Because these ``ifdefs'' are not allowed in the | |
65 Linux kernel tree, a manual or automatic process must be followed to | |
66 strip them out and yield a clean tree. A code base maintained in this | |
67 fashion rapidly becomes a rat's nest of conditional blocks that are | |
68 difficult to understand and maintain. | |
69 | |
70 Neither of these approaches is well suited to a situation where you | |
71 don't ``own'' the canonical copy of a source tree. In the case of a | |
72 Linux driver that is distributed with the standard kernel, Linus's | |
73 tree contains the copy of the code that will be treated by the world | |
74 as canonical. The upstream version of ``my'' driver can be modified | |
75 by people I don't know, without me even finding out about it until | |
76 after the changes show up in Linus's tree. | |
77 | |
78 These approaches have the added weakness of making it difficult to | |
79 generate well-formed patches to submit upstream. | |
80 | |
81 In principle, Mercurial Queues seems like a good candidate to manage a | |
82 development scenario such as the above. While this is indeed the | |
83 case, MQ contains a few added features that make the job more | |
84 pleasant. | |
85 | |
86 \section{Conditionally applying patches with | |
87 guards} | |
88 | |
89 Perhaps the best way to maintain sanity with so many targets is to be | |
90 able to choose specific patches to apply for a given situation. MQ | |
91 provides a feature called ``guards'' (which originates with quilt's | |
92 \texttt{guards} command) that does just this. To start off, let's | |
93 create a simple repository for experimenting in. | |
94 \interaction{mq.guards.init} | |
95 This gives us a tiny repository that contains two patches that don't | |
96 have any dependencies on each other, because they touch different files. | |
97 | |
98 The idea behind conditional application is that you can ``tag'' a | |
99 patch with a \emph{guard}, which is simply a text string of your | |
100 choosing, then tell MQ to select specific guards to use when applying | |
101 patches. MQ will then either apply, or skip over, a guarded patch, | |
102 depending on the guards that you have selected. | |
103 | |
104 A patch can have an arbitrary number of guards; | |
105 each one is \emph{positive} (``apply this patch if this guard is | |
106 selected'') or \emph{negative} (``skip this patch if this guard is | |
107 selected''). A patch with no guards is always applied. | |
108 | |
109 \section{Controlling the guards on a patch} | |
110 | |
111 The \hgxcmd{mq}{qguard} command lets you determine which guards should | |
112 apply to a patch, or display the guards that are already in effect. | |
113 Without any arguments, it displays the guards on the current topmost | |
114 patch. | |
115 \interaction{mq.guards.qguard} | |
116 To set a positive guard on a patch, prefix the name of the guard with | |
117 a ``\texttt{+}''. | |
118 \interaction{mq.guards.qguard.pos} | |
119 To set a negative guard on a patch, prefix the name of the guard with | |
120 a ``\texttt{-}''. | |
121 \interaction{mq.guards.qguard.neg} | |
122 | |
123 \begin{note} | |
124 The \hgxcmd{mq}{qguard} command \emph{sets} the guards on a patch; it | |
125 doesn't \emph{modify} them. What this means is that if you run | |
126 \hgcmdargs{qguard}{+a +b} on a patch, then \hgcmdargs{qguard}{+c} on | |
127 the same patch, the \emph{only} guard that will be set on it | |
128 afterwards is \texttt{+c}. | |
129 \end{note} | |
130 | |
131 Mercurial stores guards in the \sfilename{series} file; the form in | |
132 which they are stored is easy both to understand and to edit by hand. | |
133 (In other words, you don't have to use the \hgxcmd{mq}{qguard} command if | |
134 you don't want to; it's okay to simply edit the \sfilename{series} | |
135 file.) | |
136 \interaction{mq.guards.series} | |
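To illustrate how simple this format is, here is a minimal Python sketch that parses one line of a \sfilename{series} file. It assumes that each guard is written after the patch name as a ``\texttt{\#}'' followed by ``\texttt{+}'' or ``\texttt{-}'' and the guard name (for example, \texttt{hello.patch \#+foo}). The function name \texttt{parse\_series\_line} and the regular expression are illustrative assumptions, not MQ's own code.

```python
import re

# Assumed guard syntax: "#", then "+" or "-", then a name that does not
# start with "+" or "-" and contains no whitespace or "#".
GUARD_RE = re.compile(r"#([-+][^-+#\s][^#\s]*)")


def parse_series_line(line):
    """Split one series line into (patch_name, list_of_guards)."""
    guards = GUARD_RE.findall(line)
    # Everything before the first "#" is the patch name. A line that
    # starts with "#" is a plain comment, so its patch name is empty.
    name = line.split("#", 1)[0].strip()
    return name, guards


if __name__ == "__main__":
    for text in ["hello.patch #+foo", "goodbye.patch #-quux #+bar", "# a comment"]:
        print(parse_series_line(text))
```

A comment line yields an empty patch name and no guards, so a caller can skip it easily.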
137 | |
138 \section{Selecting the guards to use} | |
139 | |
140 The \hgxcmd{mq}{qselect} command determines which guards are active at a | |
141 given time. The effect of this is to determine which patches MQ will | |
142 apply the next time you run \hgxcmd{mq}{qpush}. It has no other effect; in | |
143 particular, it doesn't do anything to patches that are already | |
144 applied. | |
145 | |
146 With no arguments, the \hgxcmd{mq}{qselect} command lists the guards | |
147 currently in effect, one per line of output. Each argument is treated | |
148 as the name of a guard to apply. | |
149 \interaction{mq.guards.qselect.foo} | |
150 In case you're interested, the currently selected guards are stored in | |
151 the \sfilename{guards} file. | |
152 \interaction{mq.guards.qselect.cat} | |
153 We can see the effect the selected guards have when we run | |
154 \hgxcmd{mq}{qpush}. | |
155 \interaction{mq.guards.qselect.qpush} | |
156 | |
157 The name of a guard must not start with a ``\texttt{+}'' or ``\texttt{-}'' | |
158 character, and it must not contain white space, but most | |
159 other characters are acceptable. If you try to use a guard with an | |
160 invalid name, MQ will complain: | |
161 \interaction{mq.guards.qselect.error} | |
162 Changing the selected guards changes the patches that are applied. | |
163 \interaction{mq.guards.qselect.quux} | |
164 You can see in the example below that negative guards take precedence | |
165 over positive guards. | |
166 \interaction{mq.guards.qselect.foobar} | |
167 | |
168 \section{MQ's rules for applying patches} | |
169 | |
170 The rules that MQ uses when deciding whether to apply a patch | |
171 are as follows. | |
172 \begin{itemize} | |
173 \item A patch that has no guards is always applied. | |
174 \item If the patch has any negative guard that matches any currently | |
175 selected guard, the patch is skipped. | |
176 \item If the patch has any positive guard that matches any currently | |
177 selected guard, the patch is applied. | |
178 \item If the patch has positive guards, but none matches | |
179 any currently selected guard, the patch is skipped. A patch that has only negative guards, none of which is currently selected, is applied. | |
180 \end{itemize} | |
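The decision procedure can be sketched in a few lines of Python. This is a hedged illustration rather than MQ's implementation: the function name \texttt{should\_apply} is hypothetical, guards are passed as strings such as \texttt{+foo} or \texttt{-bar}, and the sketch assumes that a patch carrying only negative guards, none of them selected, is applied. That last point reflects my understanding of MQ's behaviour and is worth checking against your version.

```python
def should_apply(patch_guards, selected):
    """Decide whether a patch would be pushed.

    patch_guards: guards as written on the patch, e.g. ["+foo", "-bar"].
    selected: names of the currently selected guards, e.g. {"foo"}.
    """
    selected = set(selected)

    # A patch with no guards is always applied.
    if not patch_guards:
        return True

    positive = {g[1:] for g in patch_guards if g.startswith("+")}
    negative = {g[1:] for g in patch_guards if g.startswith("-")}

    # Any selected negative guard skips the patch; this check comes
    # first, which is why negative guards take precedence.
    if negative & selected:
        return False

    # With positive guards present, at least one must be selected.
    if positive:
        return bool(positive & selected)

    # Only negative guards, none selected (assumed behaviour: applied).
    return True


if __name__ == "__main__":
    print(should_apply(["+foo"], {"foo"}))
    print(should_apply(["+foo", "-bar"], {"foo", "bar"}))
```

Checking the negative guards before the positive ones is what gives negative guards their precedence when both kinds match.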
181 | |
182 \section{Trimming the work environment} | |
183 | |
184 In working on the device driver I mentioned earlier, I don't apply the | |
185 patches to a normal Linux kernel tree. Instead, I use a repository | |
186 that contains only a snapshot of the source files and headers that are | |
187 relevant to Infiniband development. This repository is~1\% of the size | |
188 of a kernel repository, so it's easier to work with. | |
189 | |
190 I then choose a ``base'' version on top of which the patches are | |
191 applied. This is a snapshot of the Linux kernel tree as of a revision | |
192 of my choosing. When I take the snapshot, I record the changeset ID | |
193 from the kernel repository in the commit message. Since the snapshot | |
194 preserves the ``shape'' and content of the relevant parts of the | |
195 kernel tree, I can apply my patches on top of either my tiny | |
196 repository or a normal kernel tree. | |
197 | |
198 Normally, the base tree atop which the patches apply should be a | |
199 snapshot of a very recent upstream tree. This best facilitates the | |
200 development of patches that can easily be submitted upstream with few | |
201 or no modifications. | |
202 | |
203 \section{Dividing up the \sfilename{series} file} | |
204 | |
205 I categorise the patches in the \sfilename{series} file into a number | |
206 of logical groups. Each section of like patches begins with a block | |
207 of comments that describes the purpose of the patches that follow. | |
208 | |
209 The sequence of patch groups that I maintain follows. The ordering of | |
210 these groups is important; I'll describe why after I introduce the | |
211 groups. | |
212 \begin{itemize} | |
213 \item The ``accepted'' group. Patches that the development team has | |
214 submitted to the maintainer of the Infiniband subsystem, and which | |
215 he has accepted, but which are not present in the snapshot that the | |
216 tiny repository is based on. These are ``read only'' patches, | |
217 present only to bring the tree into a state similar to that of | |
218 the upstream maintainer's repository. | |
219 \item The ``rework'' group. Patches that I have submitted, but that | |
220 the upstream maintainer has requested modifications to before he | |
221 will accept them. | |
222 \item The ``pending'' group. Patches that I have not yet submitted to | |
223 the upstream maintainer, but which we have finished working on. | |
224 These will be ``read only'' for a while. If the upstream maintainer | |
225 accepts them upon submission, I'll move them to the end of the | |
226 ``accepted'' group. If he requests that I modify any, I'll move | |
227 them to the beginning of the ``rework'' group. | |
228 \item The ``in progress'' group. Patches that are actively being | |
229 developed, and should not be submitted anywhere yet. | |
230 \item The ``backport'' group. Patches that adapt the source tree to | |
231 older versions of the kernel tree. | |
232 \item The ``do not ship'' group. Patches that for some reason should | |
233 never be submitted upstream. For example, one such patch might | |
234 change embedded driver identification strings to make it easier to | |
235 distinguish, in the field, between an out-of-tree version of the | |
236 driver and a version shipped by a distribution vendor. | |
237 \end{itemize} | |
238 | |
239 Now to return to the reasons for ordering groups of patches in this | |
240 way. We would like the lowest patches in the stack to be as stable as | |
241 possible, so that we will not need to rework higher patches due to | |
242 changes in context. Putting patches that will never be changed first | |
243 in the \sfilename{series} file serves this purpose. | |
244 | |
245 We would also like the patches that we know we'll need to modify to be | |
246 applied on top of a source tree that resembles the upstream tree as | |
247 closely as possible. This is why we keep accepted patches around for | |
248 a while. | |
249 | |
250 The ``backport'' and ``do not ship'' patches float at the end of the | |
251 \sfilename{series} file. The backport patches must be applied on top | |
252 of all other patches, and the ``do not ship'' patches might as well | |
253 stay out of harm's way. | |
254 | |
255 \section{Maintaining the patch series} | |
256 | |
257 In my work, I use a number of guards to control which patches are to | |
258 be applied. | |
259 | |
260 \begin{itemize} | |
261 \item ``Accepted'' patches are guarded with \texttt{accepted}. I | |
262 enable this guard most of the time. When I'm applying the patches | |
263 on top of a tree where the patches are already present, I can turn | |
264 this guard off, and the patches that follow it will apply cleanly. | |
265 \item Patches that are ``finished'', but not yet submitted, have no | |
266 guards. If I'm applying the patch stack to a copy of the upstream | |
267 tree, I don't need to enable any guards in order to get a reasonably | |
268 safe source tree. | |
269 \item Those patches that need reworking before being resubmitted are | |
270 guarded with \texttt{rework}. | |
271 \item For those patches that are still under development, I use | |
272 \texttt{devel}. | |
273 \item A backport patch may have several guards, one for each version | |
274 of the kernel to which it applies. For example, a patch that | |
275 backports a piece of code to~2.6.9 will have a~\texttt{2.6.9} guard. | |
276 \end{itemize} | |
277 This variety of guards gives me considerable flexibility in | |
278 determining what kind of source tree I want to end up with. For most | |
279 situations, the selection of appropriate guards is automated during | |
280 the build process, but I can manually tune the guards to use for less | |
281 common circumstances. | |
282 | |
283 \subsection{The art of writing backport patches} | |
284 | |
285 Using MQ, writing a backport patch is a simple process. All such a | |
286 patch has to do is modify a piece of code that uses a kernel feature | |
287 not present in the older version of the kernel, so that the driver | |
288 continues to work correctly under that older version. | |
289 | |
290 A useful goal when writing a good backport patch is to make your code | |
291 look as if it was written for the older version of the kernel you're | |
292 targeting. The less obtrusive the patch, the easier it will be to | |
293 understand and maintain. If you're writing a collection of backport | |
294 patches to avoid the ``rat's nest'' effect of lots of | |
295 \texttt{\#ifdef}s (hunks of source code that are only used | |
296 conditionally) in your code, don't introduce version-dependent | |
297 \texttt{\#ifdef}s into the patches. Instead, write several patches, | |
298 each of which makes unconditional changes, and control their | |
299 application using guards. | |
300 | |
301 There are two reasons to divide backport patches into a distinct | |
302 group, away from the ``regular'' patches whose effects they modify. | |
303 The first is that intermingling the two makes it more difficult to use | |
304 a tool like the \hgext{patchbomb} extension to automate the process of | |
305 submitting the patches to an upstream maintainer. The second is that | |
306 a backport patch could perturb the context in which a subsequent | |
307 regular patch is applied, making it impossible to apply the regular | |
308 patch cleanly \emph{without} the earlier backport patch already being | |
309 applied. | |
310 | |
311 \section{Useful tips for developing with MQ} | |
312 | |
313 \subsection{Organising patches in directories} | |
314 | |
315 If you're working on a substantial project with MQ, it's not difficult | |
316 to accumulate a large number of patches. For example, I have one | |
317 patch repository that contains over 250 patches. | |
318 | |
319 If you can group these patches into separate logical categories, you | |
320 can store them in different directories if you like; MQ has no | |
321 problems with patch names that contain path separators. | |
322 | |
323 \subsection{Viewing the history of a patch} | |
324 \label{mq-collab:tips:interdiff} | |
325 | |
326 If you're developing a set of patches over a long time, it's a good | |
327 idea to maintain them in a repository, as discussed in | |
328 section~\ref{sec:mq:repo}. If you do so, you'll quickly discover that | |
329 using the \hgcmd{diff} command to look at the history of changes to a | |
330 patch is unworkable. This is in part because you're looking at the | |
331 second derivative of the real code (a diff of a diff), but also | |
332 because MQ adds noise to the process by modifying time stamps and | |
333 directory names when it updates a patch. | |
334 | |
335 However, you can use the \hgext{extdiff} extension, which is bundled | |
336 with Mercurial, to turn a diff of two versions of a patch into | |
337 something readable. To do this, you will need a third-party package | |
338 called \package{patchutils}~\cite{web:patchutils}. This provides a | |
339 command named \command{interdiff}, which shows the differences between | |
340 two diffs as a diff. Used on two versions of the same patch, it | |
341 generates a diff that shows what changed from the first to the second | |
342 version. | |
343 | |
344 You can enable the \hgext{extdiff} extension in the usual way, by | |
345 adding a line to the \rcsection{extensions} section of your \hgrc. | |
346 \begin{codesample2} | |
347 [extensions] | |
348 extdiff = | |
349 \end{codesample2} | |
350 The \command{interdiff} command expects to be passed the names of two | |
351 files, but the \hgext{extdiff} extension passes the program it runs a | |
352 pair of directories, each of which can contain an arbitrary number of | |
353 files. We thus need a small program that will run \command{interdiff} | |
354 on each pair of files in these two directories. This program is | |
355 available as \sfilename{hg-interdiff} in the \dirname{examples} | |
356 directory of the source code repository that accompanies this book. | |
357 \excode{hg-interdiff} | |
358 | |
359 With the \sfilename{hg-interdiff} program in your shell's search path, | |
360 you can run it as follows, from inside an MQ patch directory: | |
361 \begin{codesample2} | |
362 hg extdiff -p hg-interdiff -r A:B my-change.patch | |
363 \end{codesample2} | |
364 Since you'll probably want to use this long-winded command a lot, you | |
365 can get \hgext{extdiff} to make it available as a normal Mercurial | |
366 command, again by editing your \hgrc. | |
367 \begin{codesample2} | |
368 [extdiff] | |
369 cmd.interdiff = hg-interdiff | |
370 \end{codesample2} | |
371 This directs \hgext{extdiff} to make an \texttt{interdiff} command | |
372 available, so you can now shorten the previous invocation of | |
373 \hgxcmd{extdiff}{extdiff} to something a little less unwieldy. | |
374 \begin{codesample2} | |
375 hg interdiff -r A:B my-change.patch | |
376 \end{codesample2} | |
377 | |
378 \begin{note} | |
379 The \command{interdiff} command works well only if the underlying | |
380 files against which versions of a patch are generated remain the | |
381 same. If you create a patch, modify the underlying files, and then | |
382 regenerate the patch, \command{interdiff} may not produce useful | |
383 output. | |
384 \end{note} | |
385 | |
386 The \hgext{extdiff} extension is useful for more than merely improving | |
387 the presentation of MQ~patches. To read more about it, go to | |
388 section~\ref{sec:hgext:extdiff}. | |
389 | |
390 %%% Local Variables: | |
391 %%% mode: latex | |
392 %%% TeX-master: "00book" | |
393 %%% End: |