diff en/ch13-mq-collab.xml @ 658:b90b024729f1

WIP DocBook snapshot that all compiles. Mirabile dictu!
author Bryan O'Sullivan <bos@serpentine.com>
date Wed, 18 Feb 2009 00:22:09 -0800
parents en/ch13-mq-collab.tex@f72b7e6cbe90
children 8fcd44708f41
line wrap: on
line diff
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/en/ch13-mq-collab.xml	Wed Feb 18 00:22:09 2009 -0800
@@ -0,0 +1,496 @@
+<!-- vim: set filetype=docbkxml shiftwidth=2 autoindent expandtab tw=77 : -->
+
+<chapter id="chap:mq-collab">
+  <title>Advanced uses of Mercurial Queues</title>
+
+  <para>While it's easy to pick up straightforward uses of Mercurial
+    Queues, use of a little discipline and some of MQ's less
+    frequently used capabilities makes it possible to work in
+    complicated development environments.</para>
+
+  <para>In this chapter, I will use as an example a technique I have
+    used to manage the development of an Infiniband device driver for
+    the Linux kernel.  The driver in question is large (at least as
+    drivers go), with 25,000 lines of code spread across 35 source
+    files.  It is maintained by a small team of developers.</para>
+
+  <para>While much of the material in this chapter is specific to
+    Linux, the same principles apply to any code base for which you're
+    not the primary owner, and upon which you need to do a lot of
+    development.</para>
+
+  <sect1>
+    <title>The problem of many targets</title>
+
+    <para>The Linux kernel changes rapidly, and has never been
+      internally stable; developers frequently make drastic changes
+      between releases. This means that a version of the driver that
+      works well with a particular released version of the kernel will
+      not even <emphasis>compile</emphasis> correctly against,
+      typically, any other version.</para>
+
+    <para>To maintain a driver, we have to keep a number of distinct
+      versions of Linux in mind.</para>
+    <itemizedlist>
+      <listitem><para>One target is the main Linux kernel development
+	  tree. Maintenance of the code is in this case partly shared
+	  by other developers in the kernel community, who make
+	  <quote>drive-by</quote> modifications to the driver as they
+	  develop and refine kernel subsystems.</para>
+      </listitem>
+      <listitem><para>We also maintain a number of
+	  <quote>backports</quote> to older versions of the Linux
+	  kernel, to support the needs of customers who are running
+	  older Linux distributions that do not incorporate our
+	  drivers.  (To <emphasis>backport</emphasis> a piece of code
+	  is to modify it to work in an older version of its target
+	  environment than the version it was developed for.)</para>
+      </listitem>
+      <listitem><para>Finally, we make software releases on a schedule
+	  that is necessarily not aligned with those used by Linux
+	  distributors and kernel developers, so that we can deliver
+	  new features to customers without forcing them to upgrade
+	  their entire kernels or distributions.</para>
+      </listitem></itemizedlist>
+
+    <sect2>
+      <title>Tempting approaches that don't work well</title>
+
+      <para>There are two <quote>standard</quote> ways to maintain a
+	piece of software that has to target many different
+	environments.</para>
+
+      <para>The first is to maintain a number of branches, each
+	intended for a single target.  The trouble with this approach
+	is that you must maintain iron discipline in the flow of
+	changes between repositories. A new feature or bug fix must
+	start life in a <quote>pristine</quote> repository, then
+	percolate out to every backport repository.  Backport changes
+	are more limited in the branches they should propagate to; a
+	backport change that is applied to a branch where it doesn't
+	belong will probably stop the driver from compiling.</para>
+
+      <para>The second is to maintain a single source tree filled with
+	conditional statements that turn chunks of code on or off
+	depending on the intended target.  Because these
+	<quote>ifdefs</quote> are not allowed in the Linux kernel
+	tree, a manual or automatic process must be followed to strip
+	them out and yield a clean tree.  A code base maintained in
+	this fashion rapidly becomes a rat's nest of conditional
+	blocks that are difficult to understand and maintain.</para>
+
+      <para>Neither of these approaches is well suited to a situation
+	where you don't <quote>own</quote> the canonical copy of a
+	source tree.  In the case of a Linux driver that is
+	distributed with the standard kernel, Linus's tree contains
+	the copy of the code that will be treated by the world as
+	canonical.  The upstream version of <quote>my</quote> driver
+	can be modified by people I don't know, without me even
+	finding out about it until after the changes show up in
+	Linus's tree.</para>
+
+      <para>These approaches have the added weakness of making it
+	difficult to generate well-formed patches to submit
+	upstream.</para>
+
+      <para>In principle, Mercurial Queues seems like a good candidate
+	to manage a development scenario such as the above.  While
+	this is indeed the case, MQ contains a few added features that
+	make the job more pleasant.</para>
+
+    </sect2>
+  </sect1>
+  <sect1>
+    <title>Conditionally applying patches with guards</title>
+
+    <para>Perhaps the best way to maintain sanity with so many targets
+      is to be able to choose specific patches to apply for a given
+      situation.  MQ provides a feature called <quote>guards</quote>
+      (which originates with quilt's <literal>guards</literal>
+      command) that does just this.  To start off, let's create a
+      simple repository for experimenting in. <!--
+      &interaction.mq.guards.init; --> This gives us a tiny repository
+      that contains two patches that don't have any dependencies on
+      each other, because they touch different files.</para>
+
+    <para>The idea behind conditional application is that you can
+      <quote>tag</quote> a patch with a <emphasis>guard</emphasis>,
+      which is simply a text string of your choosing, then tell MQ to
+      select specific guards to use when applying patches.  MQ will
+      then either apply, or skip over, a guarded patch, depending on
+      the guards that you have selected.</para>
+
+    <para>A patch can have an arbitrary number of guards; each one is
+      <emphasis>positive</emphasis> (<quote>apply this patch if this
+	guard is selected</quote>) or <emphasis>negative</emphasis>
+      (<quote>skip this patch if this guard is selected</quote>).  A
+      patch with no guards is always applied.</para>
+
+  </sect1>
+  <sect1>
+    <title>Controlling the guards on a patch</title>
+
+    <para>The <command role="hg-ext-mq">qguard</command> command lets
+      you determine which guards should apply to a patch, or display
+      the guards that are already in effect. Without any arguments, it
+      displays the guards on the current topmost patch. <!--
+      &interaction.mq.guards.qguard; --> To set a positive guard on a
+      patch, prefix the name of the guard with a
+      <quote><literal>+</literal></quote>. <!--
+      &interaction.mq.guards.qguard.pos; --> To set a negative guard
+      on a patch, prefix the name of the guard with a
+      <quote><literal>-</literal></quote>. <!--
+      &interaction.mq.guards.qguard.neg; --></para>
+
+    <note>
+      <para>  The <command role="hg-ext-mq">qguard</command> command
+	<emphasis>sets</emphasis> the guards on a patch; it doesn't
+	<emphasis>modify</emphasis> them.  What this means is that if
+	you run <command role="hg-cmd">hg qguard +a +b</command> on a
+	patch, then <command role="hg-cmd">hg qguard +c</command> on
+	the same patch, the <emphasis>only</emphasis> guard that will
+	be set on it afterwards is <literal>+c</literal>.</para>
+    </note>
+
+    <para>Mercurial stores guards in the <filename
+	role="special">series</filename> file; the form in which they
+      are stored is easy both to understand and to edit by hand. (In
+      other words, you don't have to use the <command
+	role="hg-ext-mq">qguard</command> command if you don't want
+      to; it's okay to simply edit the <filename
+	role="special">series</filename> file.) <!--
+      &interaction.mq.guards.series; --></para>
+
+  </sect1>
+  <sect1>
+    <title>Selecting the guards to use</title>
+
+    <para>The <command role="hg-ext-mq">qselect</command> command
+      determines which guards are active at a given time.  The effect
+      of this is to determine which patches MQ will apply the next
+      time you run <command role="hg-ext-mq">qpush</command>.  It has
+      no other effect; in particular, it doesn't do anything to
+      patches that are already applied.</para>
+
+    <para>With no arguments, the <command
+	role="hg-ext-mq">qselect</command> command lists the guards
+      currently in effect, one per line of output.  Each argument is
+      treated as the name of a guard to apply. <!--
+      &interaction.mq.guards.qselect.foo; --> In case you're
+      interested, the currently selected guards are stored in the
+      <filename role="special">guards</filename> file. <!--
+      &interaction.mq.guards.qselect.cat; --> We can see the effect
+      the selected guards have when we run <command
+	role="hg-ext-mq">qpush</command>. <!--
+      &interaction.mq.guards.qselect.qpush; --></para>
+
+    <para>A guard cannot start with a
+      <quote><literal>+</literal></quote> or
+      <quote><literal>-</literal></quote> character.  The name of a
+      guard must not contain white space, but most other characters
+      are acceptable.  If you try to use a guard with an invalid name,
+      MQ will complain: <!-- &interaction.mq.guards.qselect.error; -->
+      Changing the selected guards changes the patches that are
+      applied. <!-- &interaction.mq.guards.qselect.quux; --> You can
+      see in the example below that negative guards take precedence
+      over positive guards. <!--
+      &interaction.mq.guards.qselect.foobar; --></para>
+
+  </sect1>
+  <sect1>
+    <title>MQ's rules for applying patches</title>
+
+    <para>The rules that MQ uses when deciding whether to apply a
+      patch are as follows.</para>
+    <itemizedlist>
+      <listitem><para>A patch that has no guards is always
+	  applied.</para>
+      </listitem>
+      <listitem><para>If the patch has any negative guard that matches
+	  any currently selected guard, the patch is skipped.</para>
+      </listitem>
+      <listitem><para>If the patch has any positive guard that matches
+	  any currently selected guard, the patch is applied.</para>
+      </listitem>
+      <listitem><para>If the patch has positive or negative guards,
+	  but none matches any currently selected guard, the patch is
+	  skipped.</para>
+      </listitem></itemizedlist>
+
+  </sect1>
+  <sect1>
+    <title>Trimming the work environment</title>
+
+    <para>In working on the device driver I mentioned earlier, I don't
+      apply the patches to a normal Linux kernel tree.  Instead, I use
+      a repository that contains only a snapshot of the source files
+      and headers that are relevant to Infiniband development.  This
+      repository is 1% the size of a kernel repository, so it's easier
+      to work with.</para>
+
+    <para>I then choose a <quote>base</quote> version on top of which
+      the patches are applied.  This is a snapshot of the Linux kernel
+      tree as of a revision of my choosing.  When I take the snapshot,
+      I record the changeset ID from the kernel repository in the
+      commit message.  Since the snapshot preserves the
+      <quote>shape</quote> and content of the relevant parts of the
+      kernel tree, I can apply my patches on top of either my tiny
+      repository or a normal kernel tree.</para>
+
+    <para>Normally, the base tree atop which the patches apply should
+      be a snapshot of a very recent upstream tree.  This best
+      facilitates the development of patches that can easily be
+      submitted upstream with few or no modifications.</para>
+
+  </sect1>
+  <sect1>
+    <title>Dividing up the <filename role="special">series</filename>
+      file</title>
+
+    <para>I categorise the patches in the <filename
+	role="special">series</filename> file into a number of logical
+      groups.  Each section of like patches begins with a block of
+      comments that describes the purpose of the patches that
+      follow.</para>
+
+    <para>The sequence of patch groups that I maintain follows.  The
+      ordering of these groups is important; I'll describe why after I
+      introduce the groups.</para>
+    <itemizedlist>
+      <listitem><para>The <quote>accepted</quote> group.  Patches that
+	  the development team has submitted to the maintainer of the
+	  Infiniband subsystem, and which he has accepted, but which
+	  are not present in the snapshot that the tiny repository is
+	  based on.  These are <quote>read only</quote> patches,
+	  present only to transform the tree into a similar state as
+	  it is in the upstream maintainer's repository.</para>
+      </listitem>
+      <listitem><para>The <quote>rework</quote> group.  Patches that I
+	  have submitted, but that the upstream maintainer has
+	  requested modifications to before he will accept
+	  them.</para>
+      </listitem>
+      <listitem><para>The <quote>pending</quote> group.  Patches that
+	  I have not yet submitted to the upstream maintainer, but
+	  which we have finished working on. These will be <quote>read
+	    only</quote> for a while.  If the upstream maintainer
+	  accepts them upon submission, I'll move them to the end of
+	  the <quote>accepted</quote> group.  If he requests that I
+	  modify any, I'll move them to the beginning of the
+	  <quote>rework</quote> group.</para>
+      </listitem>
+      <listitem><para>The <quote>in progress</quote> group.  Patches
+	  that are actively being developed, and should not be
+	  submitted anywhere yet.</para>
+      </listitem>
+      <listitem><para>The <quote>backport</quote> group.  Patches that
+	  adapt the source tree to older versions of the kernel
+	  tree.</para>
+      </listitem>
+      <listitem><para>The <quote>do not ship</quote> group.  Patches
+	  that for some reason should never be submitted upstream.
+	  For example, one such patch might change embedded driver
+	  identification strings to make it easier to distinguish, in
+	  the field, between an out-of-tree version of the driver and
+	  a version shipped by a distribution vendor.</para>
+      </listitem></itemizedlist>
+
+    <para>Now to return to the reasons for ordering groups of patches
+      in this way.  We would like the lowest patches in the stack to
+      be as stable as possible, so that we will not need to rework
+      higher patches due to changes in context.  Putting patches that
+      will never be changed first in the <filename
+	role="special">series</filename> file serves this
+      purpose.</para>
+
+    <para>We would also like the patches that we know we'll need to
+      modify to be applied on top of a source tree that resembles the
+      upstream tree as closely as possible.  This is why we keep
+      accepted patches around for a while.</para>
+
+    <para>The <quote>backport</quote> and <quote>do not ship</quote>
+      patches float at the end of the <filename
+	role="special">series</filename> file.  The backport patches
+      must be applied on top of all other patches, and the <quote>do
+	not ship</quote> patches might as well stay out of harm's
+      way.</para>
+
+  </sect1>
+  <sect1>
+    <title>Maintaining the patch series</title>
+
+    <para>In my work, I use a number of guards to control which
+      patches are to be applied.</para>
+
+    <itemizedlist>
+      <listitem><para><quote>Accepted</quote> patches are guarded with
+	  <literal>accepted</literal>.  I enable this guard most of
+	  the time.  When I'm applying the patches on top of a tree
+	  where the patches are already present, I can turn this patch
+	  off, and the patches that follow it will apply
+	  cleanly.</para>
+      </listitem>
+      <listitem><para>Patches that are <quote>finished</quote>, but
+	  not yet submitted, have no guards.  If I'm applying the
+	  patch stack to a copy of the upstream tree, I don't need to
+	  enable any guards in order to get a reasonably safe source
+	  tree.</para>
+      </listitem>
+      <listitem><para>Those patches that need reworking before being
+	  resubmitted are guarded with
+	  <literal>rework</literal>.</para>
+      </listitem>
+      <listitem><para>For those patches that are still under
+	  development, I use <literal>devel</literal>.</para>
+      </listitem>
+      <listitem><para>A backport patch may have several guards, one
+	  for each version of the kernel to which it applies.  For
+	  example, a patch that backports a piece of code to 2.6.9
+	  will have a <literal>2.6.9</literal> guard.</para>
+      </listitem></itemizedlist>
+    <para>This variety of guards gives me considerable flexibility in
+      determining what kind of source tree I want to end up with.  For
+      most situations, the selection of appropriate guards is
+      automated during the build process, but I can manually tune the
+      guards to use for less common circumstances.</para>
+
+    <sect2>
+      <title>The art of writing backport patches</title>
+
+      <para>Using MQ, writing a backport patch is a simple process.
+	All such a patch has to do is modify a piece of code that uses
+	a kernel feature not present in the older version of the
+	kernel, so that the driver continues to work correctly under
+	that older version.</para>
+
+      <para>A useful goal when writing a good backport patch is to
+	make your code look as if it was written for the older version
+	of the kernel you're targeting.  The less obtrusive the patch,
+	the easier it will be to understand and maintain.  If you're
+	writing a collection of backport patches to avoid the
+	<quote>rat's nest</quote> effect of lots of
+	<literal>#ifdef</literal>s (hunks of source code that are only
+	used conditionally) in your code, don't introduce
+	version-dependent <literal>#ifdef</literal>s into the patches.
+	Instead, write several patches, each of which makes
+	unconditional changes, and control their application using
+	guards.</para>
+
+      <para>There are two reasons to divide backport patches into a
+	distinct group, away from the <quote>regular</quote> patches
+	whose effects they modify. The first is that intermingling the
+	two makes it more difficult to use a tool like the <literal
+	  role="hg-ext">patchbomb</literal> extension to automate the
+	process of submitting the patches to an upstream maintainer.
+	The second is that a backport patch could perturb the context
+	in which a subsequent regular patch is applied, making it
+	impossible to apply the regular patch cleanly
+	<emphasis>without</emphasis> the earlier backport patch
+	already being applied.</para>
+
+    </sect2>
+  </sect1>
+  <sect1>
+    <title>Useful tips for developing with MQ</title>
+
+    <sect2>
+      <title>Organising patches in directories</title>
+
+      <para>If you're working on a substantial project with MQ, it's
+	not difficult to accumulate a large number of patches.  For
+	example, I have one patch repository that contains over 250
+	patches.</para>
+
+      <para>If you can group these patches into separate logical
+	categories, you can if you like store them in different
+	directories; MQ has no problems with patch names that contain
+	path separators.</para>
+
+    </sect2>
+    <sect2 id="mq-collab:tips:interdiff">
+      <title>Viewing the history of a patch</title>
+
+      <para>If you're developing a set of patches over a long time,
+	it's a good idea to maintain them in a repository, as
+	discussed in section <xref linkend="sec:mq:repo"/>.  If you do
+	so, you'll quickly
+	discover that using the <command role="hg-cmd">hg
+	  diff</command> command to look at the history of changes to
+	a patch is unworkable.  This is in part because you're looking
+	at the second derivative of the real code (a diff of a diff),
+	but also because MQ adds noise to the process by modifying
+	time stamps and directory names when it updates a
+	patch.</para>
+
+      <para>However, you can use the <literal
+	  role="hg-ext">extdiff</literal> extension, which is bundled
+	with Mercurial, to turn a diff of two versions of a patch into
+	something readable.  To do this, you will need a third-party
+	package called <literal role="package">patchutils</literal>
+	<citation>web:patchutils</citation>.  This provides a command
+	named <command>interdiff</command>, which shows the
+	differences between two diffs as a diff.  Used on two versions
+	of the same diff, it generates a diff that represents the diff
+	from the first to the second version.</para>
+
+      <para>You can enable the <literal
+	  role="hg-ext">extdiff</literal> extension in the usual way,
+	by adding a line to the <literal
+	  role="rc-extensions">extensions</literal> section of your
+	<filename role="special"> /.hgrc</filename>.</para>
+      <programlisting>[extensions] extdiff =</programlisting>
+      <para>The <command>interdiff</command> command expects to be
+	passed the names of two files, but the <literal
+	  role="hg-ext">extdiff</literal> extension passes the program
+	it runs a pair of directories, each of which can contain an
+	arbitrary number of files.  We thus need a small program that
+	will run <command>interdiff</command> on each pair of files in
+	these two directories.  This program is available as <filename
+	  role="special">hg-interdiff</filename> in the <filename
+	  class="directory">examples</filename> directory of the
+	source code repository that accompanies this book. <!--
+	&example.hg-interdiff; --></para>
+
+      <para>With the <filename role="special">hg-interdiff</filename>
+	program in your shell's search path, you can run it as
+	follows, from inside an MQ patch directory:</para>
+      <programlisting>hg extdiff -p hg-interdiff -r A:B
+	my-change.patch</programlisting>
+      <para>Since you'll probably want to use this long-winded command
+	a lot, you can get <literal role="hg-ext">hgext</literal> to
+	make it available as a normal Mercurial command, again by
+	editing your <filename role="special">
+	  /.hgrc</filename>.</para>
+      <programlisting>[extdiff] cmd.interdiff =
+	hg-interdiff</programlisting>
+      <para>This directs <literal role="hg-ext">hgext</literal> to
+	make an <literal>interdiff</literal> command available, so you
+	can now shorten the previous invocation of <command
+	  role="hg-ext-extdiff">extdiff</command> to something a
+	little more wieldy.</para>
+      <programlisting>hg interdiff -r A:B
+	my-change.patch</programlisting>
+
+      <note>
+	<para>  The <command>interdiff</command> command works well
+	  only if the underlying files against which versions of a
+	  patch are generated remain the same.  If you create a patch,
+	  modify the underlying files, and then regenerate the patch,
+	  <command>interdiff</command> may not produce useful
+	  output.</para>
+      </note>
+
+      <para>The <literal role="hg-ext">extdiff</literal> extension is
+	useful for more than merely improving the presentation of MQ
+	patches.  To read more about it, go to section <xref
+	  linkend="sec:hgext:extdiff"/>.</para>
+
+    </sect2>
+  </sect1>
+</chapter>
+
+<!--
+local variables: 
+sgml-parent-document: ("00book.xml" "book" "chapter")
+end:
+-->