Mercurial > hgbook
diff es/mq-collab.tex @ 435:7e52f0cc4516
changed es/hgext.tex
changed es/hook.tex
changed es/kdiff3.png
changed es/license.tex
changed es/mq-collab.tex
changed es/mq-ref.tex
changed es/mq.tex
changed es/note.png
changed es/tour-merge.tex
changed es/undo-manual-merge.dot
changed es/undo-non-tip.dot
files needed to compile the pdf version of the book.
author | jerojasro@localhost |
---|---|
date | Sat, 18 Oct 2008 15:44:41 -0500 |
parents | 04c08ad7e92e |
children | 6cf30b3ed48f |
line wrap: on
line diff
--- a/es/mq-collab.tex Sat Oct 18 14:35:43 2008 -0500 +++ b/es/mq-collab.tex Sat Oct 18 15:44:41 2008 -0500 @@ -0,0 +1,393 @@ +\chapter{Advanced uses of Mercurial Queues} +\label{chap:mq-collab} + +While it's easy to pick up straightforward uses of Mercurial Queues, +use of a little discipline and some of MQ's less frequently used +capabilities makes it possible to work in complicated development +environments. + +In this chapter, I will use as an example a technique I have used to +manage the development of an Infiniband device driver for the Linux +kernel. The driver in question is large (at least as drivers go), +with 25,000 lines of code spread across 35 source files. It is +maintained by a small team of developers. + +While much of the material in this chapter is specific to Linux, the +same principles apply to any code base for which you're not the +primary owner, and upon which you need to do a lot of development. + +\section{The problem of many targets} + +The Linux kernel changes rapidly, and has never been internally +stable; developers frequently make drastic changes between releases. +This means that a version of the driver that works well with a +particular released version of the kernel will not even \emph{compile} +correctly against, typically, any other version. + +To maintain a driver, we have to keep a number of distinct versions of +Linux in mind. +\begin{itemize} +\item One target is the main Linux kernel development tree. + Maintenance of the code is in this case partly shared by other + developers in the kernel community, who make ``drive-by'' + modifications to the driver as they develop and refine kernel + subsystems. +\item We also maintain a number of ``backports'' to older versions of + the Linux kernel, to support the needs of customers who are running + older Linux distributions that do not incorporate our drivers. (To + \emph{backport} a piece of code is to modify it to work in an older + version of its target environment than the version it was developed + for.) +\item Finally, we make software releases on a schedule that is + necessarily not aligned with those used by Linux distributors and + kernel developers, so that we can deliver new features to customers + without forcing them to upgrade their entire kernels or + distributions. +\end{itemize} + +\subsection{Tempting approaches that don't work well} + +There are two ``standard'' ways to maintain a piece of software that +has to target many different environments. + +The first is to maintain a number of branches, each intended for a +single target. The trouble with this approach is that you must +maintain iron discipline in the flow of changes between repositories. +A new feature or bug fix must start life in a ``pristine'' repository, +then percolate out to every backport repository. Backport changes are +more limited in the branches they should propagate to; a backport +change that is applied to a branch where it doesn't belong will +probably stop the driver from compiling. + +The second is to maintain a single source tree filled with conditional +statements that turn chunks of code on or off depending on the +intended target. Because these ``ifdefs'' are not allowed in the +Linux kernel tree, a manual or automatic process must be followed to +strip them out and yield a clean tree. A code base maintained in this +fashion rapidly becomes a rat's nest of conditional blocks that are +difficult to understand and maintain. + +Neither of these approaches is well suited to a situation where you +don't ``own'' the canonical copy of a source tree. In the case of a +Linux driver that is distributed with the standard kernel, Linus's +tree contains the copy of the code that will be treated by the world +as canonical. The upstream version of ``my'' driver can be modified +by people I don't know, without me even finding out about it until +after the changes show up in Linus's tree. + +These approaches have the added weakness of making it difficult to +generate well-formed patches to submit upstream. + +In principle, Mercurial Queues seems like a good candidate to manage a +development scenario such as the above. While this is indeed the +case, MQ contains a few added features that make the job more +pleasant. + +\section{Conditionally applying patches with + guards} + +Perhaps the best way to maintain sanity with so many targets is to be +able to choose specific patches to apply for a given situation. MQ +provides a feature called ``guards'' (which originates with quilt's +\texttt{guards} command) that does just this. To start off, let's +create a simple repository for experimenting in. +\interaction{mq.guards.init} +This gives us a tiny repository that contains two patches that don't +have any dependencies on each other, because they touch different files. + +The idea behind conditional application is that you can ``tag'' a +patch with a \emph{guard}, which is simply a text string of your +choosing, then tell MQ to select specific guards to use when applying +patches. MQ will then either apply, or skip over, a guarded patch, +depending on the guards that you have selected. + +A patch can have an arbitrary number of guards; +each one is \emph{positive} (``apply this patch if this guard is +selected'') or \emph{negative} (``skip this patch if this guard is +selected''). A patch with no guards is always applied. + +\section{Controlling the guards on a patch} + +The \hgxcmd{mq}{qguard} command lets you determine which guards should +apply to a patch, or display the guards that are already in effect. +Without any arguments, it displays the guards on the current topmost +patch. +\interaction{mq.guards.qguard} +To set a positive guard on a patch, prefix the name of the guard with +a ``\texttt{+}''. +\interaction{mq.guards.qguard.pos} +To set a negative guard on a patch, prefix the name of the guard with +a ``\texttt{-}''. +\interaction{mq.guards.qguard.neg} + +\begin{note} + The \hgxcmd{mq}{qguard} command \emph{sets} the guards on a patch; it + doesn't \emph{modify} them. What this means is that if you run + \hgcmdargs{qguard}{+a +b} on a patch, then \hgcmdargs{qguard}{+c} on + the same patch, the \emph{only} guard that will be set on it + afterwards is \texttt{+c}. +\end{note} + +Mercurial stores guards in the \sfilename{series} file; the form in +which they are stored is easy both to understand and to edit by hand. +(In other words, you don't have to use the \hgxcmd{mq}{qguard} command if +you don't want to; it's okay to simply edit the \sfilename{series} +file.) +\interaction{mq.guards.series} + +\section{Selecting the guards to use} + +The \hgxcmd{mq}{qselect} command determines which guards are active at a +given time. The effect of this is to determine which patches MQ will +apply the next time you run \hgxcmd{mq}{qpush}. It has no other effect; in +particular, it doesn't do anything to patches that are already +applied. + +With no arguments, the \hgxcmd{mq}{qselect} command lists the guards +currently in effect, one per line of output. Each argument is treated +as the name of a guard to apply. +\interaction{mq.guards.qselect.foo} +In case you're interested, the currently selected guards are stored in +the \sfilename{guards} file. +\interaction{mq.guards.qselect.cat} +We can see the effect the selected guards have when we run +\hgxcmd{mq}{qpush}. +\interaction{mq.guards.qselect.qpush} + +A guard cannot start with a ``\texttt{+}'' or ``\texttt{-}'' +character. The name of a guard must not contain white space, but most +other characters are acceptable. If you try to use a guard with an +invalid name, MQ will complain: +\interaction{mq.guards.qselect.error} +Changing the selected guards changes the patches that are applied. +\interaction{mq.guards.qselect.quux} +You can see in the example below that negative guards take precedence +over positive guards. +\interaction{mq.guards.qselect.foobar} + +\section{MQ's rules for applying patches} + +The rules that MQ uses when deciding whether to apply a patch +are as follows. +\begin{itemize} +\item A patch that has no guards is always applied. +\item If the patch has any negative guard that matches any currently + selected guard, the patch is skipped. +\item If the patch has any positive guard that matches any currently + selected guard, the patch is applied. +\item If the patch has positive or negative guards, but none matches + any currently selected guard, the patch is skipped. +\end{itemize} + +\section{Trimming the work environment} + +In working on the device driver I mentioned earlier, I don't apply the +patches to a normal Linux kernel tree. Instead, I use a repository +that contains only a snapshot of the source files and headers that are +relevant to Infiniband development. This repository is~1\% the size +of a kernel repository, so it's easier to work with. + +I then choose a ``base'' version on top of which the patches are +applied. This is a snapshot of the Linux kernel tree as of a revision +of my choosing. When I take the snapshot, I record the changeset ID +from the kernel repository in the commit message. Since the snapshot +preserves the ``shape'' and content of the relevant parts of the +kernel tree, I can apply my patches on top of either my tiny +repository or a normal kernel tree. + +Normally, the base tree atop which the patches apply should be a +snapshot of a very recent upstream tree. This best facilitates the +development of patches that can easily be submitted upstream with few +or no modifications. + +\section{Dividing up the \sfilename{series} file} + +I categorise the patches in the \sfilename{series} file into a number +of logical groups. Each section of like patches begins with a block +of comments that describes the purpose of the patches that follow. + +The sequence of patch groups that I maintain follows. The ordering of +these groups is important; I'll describe why after I introduce the +groups. +\begin{itemize} +\item The ``accepted'' group. Patches that the development team has + submitted to the maintainer of the Infiniband subsystem, and which + he has accepted, but which are not present in the snapshot that the + tiny repository is based on. These are ``read only'' patches, + present only to transform the tree into a similar state as it is in + the upstream maintainer's repository. +\item The ``rework'' group. Patches that I have submitted, but that + the upstream maintainer has requested modifications to before he + will accept them. +\item The ``pending'' group. Patches that I have not yet submitted to + the upstream maintainer, but which we have finished working on. + These will be ``read only'' for a while. If the upstream maintainer + accepts them upon submission, I'll move them to the end of the + ``accepted'' group. If he requests that I modify any, I'll move + them to the beginning of the ``rework'' group. +\item The ``in progress'' group. Patches that are actively being + developed, and should not be submitted anywhere yet. +\item The ``backport'' group. Patches that adapt the source tree to + older versions of the kernel tree. +\item The ``do not ship'' group. Patches that for some reason should + never be submitted upstream. For example, one such patch might + change embedded driver identification strings to make it easier to + distinguish, in the field, between an out-of-tree version of the + driver and a version shipped by a distribution vendor. +\end{itemize} + +Now to return to the reasons for ordering groups of patches in this +way. We would like the lowest patches in the stack to be as stable as +possible, so that we will not need to rework higher patches due to +changes in context. Putting patches that will never be changed first +in the \sfilename{series} file serves this purpose. + +We would also like the patches that we know we'll need to modify to be +applied on top of a source tree that resembles the upstream tree as +closely as possible. This is why we keep accepted patches around for +a while. + +The ``backport'' and ``do not ship'' patches float at the end of the +\sfilename{series} file. The backport patches must be applied on top +of all other patches, and the ``do not ship'' patches might as well +stay out of harm's way. + +\section{Maintaining the patch series} + +In my work, I use a number of guards to control which patches are to +be applied. + +\begin{itemize} +\item ``Accepted'' patches are guarded with \texttt{accepted}. I + enable this guard most of the time. When I'm applying the patches + on top of a tree where the patches are already present, I can turn + this patch off, and the patches that follow it will apply cleanly. +\item Patches that are ``finished'', but not yet submitted, have no + guards. If I'm applying the patch stack to a copy of the upstream + tree, I don't need to enable any guards in order to get a reasonably + safe source tree. +\item Those patches that need reworking before being resubmitted are + guarded with \texttt{rework}. +\item For those patches that are still under development, I use + \texttt{devel}. +\item A backport patch may have several guards, one for each version + of the kernel to which it applies. For example, a patch that + backports a piece of code to~2.6.9 will have a~\texttt{2.6.9} guard. +\end{itemize} +This variety of guards gives me considerable flexibility in +qdetermining what kind of source tree I want to end up with. For most +situations, the selection of appropriate guards is automated during +the build process, but I can manually tune the guards to use for less +common circumstances. + +\subsection{The art of writing backport patches} + +Using MQ, writing a backport patch is a simple process. All such a +patch has to do is modify a piece of code that uses a kernel feature +not present in the older version of the kernel, so that the driver +continues to work correctly under that older version. + +A useful goal when writing a good backport patch is to make your code +look as if it was written for the older version of the kernel you're +targeting. The less obtrusive the patch, the easier it will be to +understand and maintain. If you're writing a collection of backport +patches to avoid the ``rat's nest'' effect of lots of +\texttt{\#ifdef}s (hunks of source code that are only used +conditionally) in your code, don't introduce version-dependent +\texttt{\#ifdef}s into the patches. Instead, write several patches, +each of which makes unconditional changes, and control their +application using guards. + +There are two reasons to divide backport patches into a distinct +group, away from the ``regular'' patches whose effects they modify. +The first is that intermingling the two makes it more difficult to use +a tool like the \hgext{patchbomb} extension to automate the process of +submitting the patches to an upstream maintainer. The second is that +a backport patch could perturb the context in which a subsequent +regular patch is applied, making it impossible to apply the regular +patch cleanly \emph{without} the earlier backport patch already being +applied. + +\section{Useful tips for developing with MQ} + +\subsection{Organising patches in directories} + +If you're working on a substantial project with MQ, it's not difficult +to accumulate a large number of patches. For example, I have one +patch repository that contains over 250 patches. + +If you can group these patches into separate logical categories, you +can if you like store them in different directories; MQ has no +problems with patch names that contain path separators. + +\subsection{Viewing the history of a patch} +\label{mq-collab:tips:interdiff} + +If you're developing a set of patches over a long time, it's a good +idea to maintain them in a repository, as discussed in +section~\ref{sec:mq:repo}. If you do so, you'll quickly discover that +using the \hgcmd{diff} command to look at the history of changes to a +patch is unworkable. This is in part because you're looking at the +second derivative of the real code (a diff of a diff), but also +because MQ adds noise to the process by modifying time stamps and +directory names when it updates a patch. + +However, you can use the \hgext{extdiff} extension, which is bundled +with Mercurial, to turn a diff of two versions of a patch into +something readable. To do this, you will need a third-party package +called \package{patchutils}~\cite{web:patchutils}. This provides a +command named \command{interdiff}, which shows the differences between +two diffs as a diff. Used on two versions of the same diff, it +generates a diff that represents the diff from the first to the second +version. + +You can enable the \hgext{extdiff} extension in the usual way, by +adding a line to the \rcsection{extensions} section of your \hgrc. +\begin{codesample2} + [extensions] + extdiff = +\end{codesample2} +The \command{interdiff} command expects to be passed the names of two +files, but the \hgext{extdiff} extension passes the program it runs a +pair of directories, each of which can contain an arbitrary number of +files. We thus need a small program that will run \command{interdiff} +on each pair of files in these two directories. This program is +available as \sfilename{hg-interdiff} in the \dirname{examples} +directory of the source code repository that accompanies this book. +\excode{hg-interdiff} + +With the \sfilename{hg-interdiff} program in your shell's search path, +you can run it as follows, from inside an MQ patch directory: +\begin{codesample2} + hg extdiff -p hg-interdiff -r A:B my-change.patch +\end{codesample2} +Since you'll probably want to use this long-winded command a lot, you +can get \hgext{hgext} to make it available as a normal Mercurial +command, again by editing your \hgrc. +\begin{codesample2} + [extdiff] + cmd.interdiff = hg-interdiff +\end{codesample2} +This directs \hgext{hgext} to make an \texttt{interdiff} command +available, so you can now shorten the previous invocation of +\hgxcmd{extdiff}{extdiff} to something a little more wieldy. +\begin{codesample2} + hg interdiff -r A:B my-change.patch +\end{codesample2} + +\begin{note} + The \command{interdiff} command works well only if the underlying + files against which versions of a patch are generated remain the + same. If you create a patch, modify the underlying files, and then + regenerate the patch, \command{interdiff} may not produce useful + output. +\end{note} + +The \hgext{extdiff} extension is useful for more than merely improving +the presentation of MQ~patches. To read more about it, go to +section~\ref{sec:hgext:extdiff}. + +%%% Local Variables: +%%% mode: latex +%%% TeX-master: "00book" +%%% End: