diff en/ch08-undo.xml @ 776:019040fbf5f5
merged to upstream: phase 1
author | Yoshiki Yazawa <yaz@honeyplanet.jp> |
---|---|
date | Tue, 21 Apr 2009 00:36:40 +0900 |
parents | b338f5490029 |
children | 7226e5e750a6 |
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/en/ch08-undo.xml Tue Apr 21 00:36:40 2009 +0900 @@ -0,0 +1,1069 @@ +<!-- vim: set filetype=docbkxml shiftwidth=2 autoindent expandtab tw=77 : --> + +<chapter id="chap:undo"> + <?dbhtml filename="finding-and-fixing-mistakes.html"?> + <title>Finding and fixing mistakes</title> + + <para id="x_d2">To err might be human, but to really handle the consequences + well takes a top-notch revision control system. In this chapter, + we'll discuss some of the techniques you can use when you find + that a problem has crept into your project. Mercurial has some + highly capable features that will help you to isolate the sources + of problems, and to handle them appropriately.</para> + + <sect1> + <title>Erasing local history</title> + + <sect2> + <title>The accidental commit</title> + + <para id="x_d3">I have the occasional but persistent problem of typing + rather more quickly than I can think, which sometimes results + in me committing a changeset that is either incomplete or + plain wrong. In my case, the usual kind of incomplete + changeset is one in which I've created a new source file, but + forgotten to <command role="hg-cmd">hg add</command> it. A + <quote>plain wrong</quote> changeset is not as common, but no + less annoying.</para> + + </sect2> + <sect2 id="sec:undo:rollback"> + <title>Rolling back a transaction</title> + + <para id="x_d4">In <xref linkend="sec:concepts:txn"/>, I + mentioned that Mercurial treats each modification of a + repository as a <emphasis>transaction</emphasis>. Every time + you commit a changeset or pull changes from another + repository, Mercurial remembers what you did. You can undo, + or <emphasis>roll back</emphasis>, exactly one of these + actions using the <command role="hg-cmd">hg rollback</command> + command. 
(See <xref linkend="sec:undo:rollback-after-push"/> + for an important caveat about the use of this command.)</para> + + <para id="x_d5">Here's a mistake that I often find myself making: + committing a change in which I've created a new file, but + forgotten to <command role="hg-cmd">hg add</command> + it.</para> + + &interaction.rollback.commit; + + <para id="x_d6">Looking at the output of <command role="hg-cmd">hg + status</command> after the commit immediately confirms the + error.</para> + + &interaction.rollback.status; + + <para id="x_d7">The commit captured the changes to the file + <filename>a</filename>, but not the new file + <filename>b</filename>. If I were to push this changeset to a + repository that I shared with a colleague, the chances are + high that something in <filename>a</filename> would refer to + <filename>b</filename>, which would not be present in their + repository when they pulled my changes. I would thus become + the object of some indignation.</para> + + <para id="x_d8">However, luck is with me&emdash;I've caught my error + before I pushed the changeset. I use the <command + role="hg-cmd">hg rollback</command> command, and Mercurial + makes that last changeset vanish.</para> + + &interaction.rollback.rollback; + + <para id="x_d9">Notice that the changeset is no longer present in the + repository's history, and the working directory once again + thinks that the file <filename>a</filename> is modified. The + commit and rollback have left the working directory exactly as + it was prior to the commit; the changeset has been completely + erased. I can now safely <command role="hg-cmd">hg + add</command> the file <filename>b</filename>, and rerun my + commit.</para> + + &interaction.rollback.add; + + </sect2> + <sect2> + <title>The erroneous pull</title> + + <para id="x_da">It's common practice with Mercurial to maintain separate + development branches of a project in different repositories. 
+ Your development team might have one shared repository for + your project's <quote>0.9</quote> release, and another, + containing different changes, for the <quote>1.0</quote> + release.</para> + + <para id="x_db">Given this, you can imagine that the consequences could be + messy if you had a local <quote>0.9</quote> repository, and + accidentally pulled changes from the shared <quote>1.0</quote> + repository into it. At worst, you could be paying + insufficient attention, and push those changes into the shared + <quote>0.9</quote> tree, confusing your entire team (but don't + worry, we'll return to this horror scenario later). However, + it's more likely that you'll notice immediately, because + Mercurial will display the URL it's pulling from, or you will + see it pull a suspiciously large number of changes into the + repository.</para> + + <para id="x_dc">The <command role="hg-cmd">hg rollback</command> command + will work nicely to expunge all of the changesets that you + just pulled. Mercurial groups all changes from one <command + role="hg-cmd">hg pull</command> into a single transaction, + so one <command role="hg-cmd">hg rollback</command> is all you + need to undo this mistake.</para> + + </sect2> + <sect2 id="sec:undo:rollback-after-push"> + <title>Rolling back is useless once you've pushed</title> + + <para id="x_dd">The value of the <command role="hg-cmd">hg + rollback</command> command drops to zero once you've pushed + your changes to another repository. Rolling back a change + makes it disappear entirely, but <emphasis>only</emphasis> in + the repository in which you perform the <command + role="hg-cmd">hg rollback</command>. 
Because a rollback + eliminates history, there's no way for the disappearance of a + change to propagate between repositories.</para> + + <para id="x_de">If you've pushed a change to another + repository&emdash;particularly if it's a shared + repository&emdash;it has essentially <quote>escaped into the + wild,</quote> and you'll have to recover from your mistake + in a different way. If you push a changeset + somewhere, then roll it back, then pull from the repository + you pushed to, the changeset will reappear in your + repository.</para> + + <para id="x_df">(If you absolutely know for sure that the change you want + to roll back is the most recent change in the repository that + you pushed to, <emphasis>and</emphasis> you know that nobody + else could have pulled it from that repository, you can roll + back the changeset there, too, but you really should + not rely on this working reliably. If you do this, sooner or + later a change really will make it into a repository that you + don't directly control (or have forgotten about), and come + back to bite you.)</para> + + </sect2> + <sect2> + <title>You can only roll back once</title> + + <para id="x_e0">Mercurial stores exactly one transaction in its + transaction log; that transaction is the most recent one that + occurred in the repository. This means that you can only roll + back one transaction. 
If you expect to be able to roll back + one transaction, then its predecessor, this is not the + behavior you will get.</para> + + &interaction.rollback.twice; + + <para id="x_e1">Once you've rolled back one transaction in a repository, + you can't roll back again in that repository until you perform + another commit or pull.</para> + + </sect2> + </sect1> + <sect1> + <title>Reverting the mistaken change</title> + + <para id="x_e2">If you make a modification to a file, and decide that you + really didn't want to change the file at all, and you haven't + yet committed your changes, the <command role="hg-cmd">hg + revert</command> command is the one you'll need. It looks at + the changeset that's the parent of the working directory, and + restores the contents of the file to their state as of that + changeset. (That's a long-winded way of saying that, in the + normal case, it undoes your modifications.)</para> + + <para id="x_e3">Let's illustrate how the <command role="hg-cmd">hg + revert</command> command works with yet another small example. + We'll begin by modifying a file that Mercurial is already + tracking.</para> + + &interaction.daily.revert.modify; + + <para id="x_e4">If we don't + want that change, we can simply <command role="hg-cmd">hg + revert</command> the file.</para> + + &interaction.daily.revert.unmodify; + + <para id="x_e5">The <command role="hg-cmd">hg revert</command> command + provides us with an extra degree of safety by saving our + modified file with a <filename>.orig</filename> + extension.</para> + + &interaction.daily.revert.status; + + <para id="x_e6">Here is a summary of the cases that the <command + role="hg-cmd">hg revert</command> command can deal with. 
We + will describe each of these in more detail in the section that + follows.</para> + <itemizedlist> + <listitem><para id="x_e7">If you modify a file, it will restore the file + to its unmodified state.</para> + </listitem> + <listitem><para id="x_e8">If you <command role="hg-cmd">hg add</command> a + file, it will undo the <quote>added</quote> state of the + file, but leave the file itself untouched.</para> + </listitem> + <listitem><para id="x_e9">If you delete a file without telling Mercurial, + it will restore the file to its unmodified contents.</para> + </listitem> + <listitem><para id="x_ea">If you use the <command role="hg-cmd">hg + remove</command> command to remove a file, it will undo + the <quote>removed</quote> state of the file, and restore + the file to its unmodified contents.</para> + </listitem></itemizedlist> + + <sect2 id="sec:undo:mgmt"> + <title>File management errors</title> + + <para id="x_eb">The <command role="hg-cmd">hg revert</command> command is + useful for more than just modified files. It lets you reverse + the results of all of Mercurial's file management + commands&emdash;<command role="hg-cmd">hg add</command>, + <command role="hg-cmd">hg remove</command>, and so on.</para> + + <para id="x_ec">If you <command role="hg-cmd">hg add</command> a file, + then decide that in fact you don't want Mercurial to track it, + use <command role="hg-cmd">hg revert</command> to undo the + add. Don't worry; Mercurial will not modify the file in any + way. It will just <quote>unmark</quote> the file.</para> + + &interaction.daily.revert.add; + + <para id="x_ed">Similarly, if you ask Mercurial to <command + role="hg-cmd">hg remove</command> a file, you can use + <command role="hg-cmd">hg revert</command> to restore it to + the contents it had as of the parent of the working directory. 
+ &interaction.daily.revert.remove; This works just as + well for a file that you deleted by hand, without telling + Mercurial (recall that in Mercurial terminology, this kind of + file is called <quote>missing</quote>).</para> + + &interaction.daily.revert.missing; + + <para id="x_ee">If you revert a <command role="hg-cmd">hg copy</command>, + the copied-to file remains in your working directory + afterwards, untracked. Since a copy doesn't affect the + copied-from file in any way, Mercurial doesn't do anything + with the copied-from file.</para> + + &interaction.daily.revert.copy; + + <sect3> + <title>A slightly special case: reverting a rename</title> + + <para id="x_ef">If you <command role="hg-cmd">hg rename</command> a + file, there is one small detail that you should remember. + When you <command role="hg-cmd">hg revert</command> a + rename, it's not enough to provide the name of the + renamed-to file, as you can see here.</para> + + &interaction.daily.revert.rename; + + <para id="x_f0">As you can see from the output of <command + role="hg-cmd">hg status</command>, the renamed-to file is + no longer identified as added, but the + renamed-<emphasis>from</emphasis> file is still removed! + This is counter-intuitive (at least to me), but at least + it's easy to deal with.</para> + + &interaction.daily.revert.rename-orig; + + <para id="x_f1">So remember, to revert a <command role="hg-cmd">hg + rename</command>, you must provide + <emphasis>both</emphasis> the source and destination + names.</para> + + <para id="x_f2">% TODO: the output doesn't look like it will be + removed!</para> + + <para id="x_f3">(By the way, if you rename a file, then modify the + renamed-to file, then revert both components of the rename, + when Mercurial restores the file that was removed as part of + the rename, it will be unmodified. 
If you need the + modifications in the renamed-to file to show up in the + renamed-from file, don't forget to copy them over.)</para> + + <para id="x_f4">These fiddly aspects of reverting a rename arguably + constitute a small bug in Mercurial.</para> + + </sect3> + </sect2> + </sect1> + <sect1> + <title>Dealing with committed changes</title> + + <para id="x_f5">Consider a case where you have committed a change $a$, and + another change $b$ on top of it; you then realise that change + $a$ was incorrect. Mercurial lets you <quote>back out</quote> + an entire changeset automatically, and gives you building blocks that let + you reverse part of a changeset by hand.</para> + + <para id="x_f6">Before you read this section, here's something to + keep in mind: the <command role="hg-cmd">hg backout</command> + command undoes changes by <emphasis>adding</emphasis> history, + not by modifying or erasing it. It's the right tool to use if + you're fixing bugs, but not if you're trying to undo some change + that has catastrophic consequences. To deal with those, see + <xref linkend="sec:undo:aaaiiieee"/>.</para> + + <sect2> + <title>Backing out a changeset</title> + + <para id="x_f7">The <command role="hg-cmd">hg backout</command> command + lets you <quote>undo</quote> the effects of an entire + changeset in an automated fashion. Because Mercurial's + history is immutable, this command <emphasis>does + not</emphasis> get rid of the changeset you want to undo. + Instead, it creates a new changeset that + <emphasis>reverses</emphasis> the effect of the to-be-undone + changeset.</para> + + <para id="x_f8">The operation of the <command role="hg-cmd">hg + backout</command> command is a little intricate, so let's + illustrate it with some examples. 
First, we'll create a + repository with some simple changes.</para> + + &interaction.backout.init; + + <para id="x_f9">The <command role="hg-cmd">hg backout</command> command + takes a single changeset ID as its argument; this is the + changeset to back out. Normally, <command role="hg-cmd">hg + backout</command> will drop you into a text editor to write + a commit message, so you can record why you're backing the + change out. In this example, we provide a commit message on + the command line using the <option + role="hg-opt-backout">-m</option> option.</para> + + </sect2> + <sect2> + <title>Backing out the tip changeset</title> + + <para id="x_fa">We're going to start by backing out the last changeset we + committed.</para> + + &interaction.backout.simple; + + <para id="x_fb">You can see that the second line from + <filename>myfile</filename> is no longer present. Taking a + look at the output of <command role="hg-cmd">hg log</command> + gives us an idea of what the <command role="hg-cmd">hg + backout</command> command has done. + &interaction.backout.simple.log; Notice that the new changeset + that <command role="hg-cmd">hg backout</command> has created + is a child of the changeset we backed out. It's easier to see + this in <xref linkend="fig:undo:backout"/>, which presents a + graphical view of the change history. 
As you can see, the + history is nice and linear.</para> + + <figure id="fig:undo:backout"> + <title>Backing out a change using the <command + role="hg-cmd">hg backout</command> command</title> + <mediaobject> + <imageobject><imagedata fileref="figs/undo-simple.png"/></imageobject> + <textobject><phrase>XXX add text</phrase></textobject> + </mediaobject> + </figure> + + </sect2> + <sect2> + <title>Backing out a non-tip change</title> + + <para id="x_fd">If you want to back out a change other than the last one + you committed, pass the <option + role="hg-opt-backout">--merge</option> option to the + <command role="hg-cmd">hg backout</command> command.</para> + + &interaction.backout.non-tip.clone; + + <para id="x_fe">This makes backing out any changeset a + <quote>one-shot</quote> operation that's usually simple and + fast.</para> + + &interaction.backout.non-tip.backout; + + <para id="x_ff">If you take a look at the contents of + <filename>myfile</filename> after the backout finishes, you'll + see that the first and third changes are present, but not the + second.</para> + + &interaction.backout.non-tip.cat; + + <para id="x_100">As the graphical history in <xref + linkend="fig:undo:backout-non-tip"/> illustrates, Mercurial + actually commits <emphasis>two</emphasis> changes in this kind + of situation (the box-shaped nodes are the ones that Mercurial + commits automatically). Before Mercurial begins the backout + process, it first remembers what the current parent of the + working directory is. It then backs out the target changeset, + and commits that as a changeset. 
Finally, it merges back to + the previous parent of the working directory, and commits the + result of the merge.</para> + + <para id="x_101">% TODO: to me it looks like mercurial doesn't commit the + second merge automatically!</para> + + <figure id="fig:undo:backout-non-tip"> + <title>Automated backout of a non-tip change using the + <command role="hg-cmd">hg backout</command> command</title> + <mediaobject> + <imageobject><imagedata fileref="figs/undo-non-tip.png"/></imageobject> + <textobject><phrase>XXX add text</phrase></textobject> + </mediaobject> + </figure> + + <para id="x_103">The result is that you end up <quote>back where you + were</quote>, only with some extra history that undoes the + effect of the changeset you wanted to back out.</para> + + <sect3> + <title>Always use the <option + role="hg-opt-backout">--merge</option> option</title> + + <para id="x_104">In fact, since the <option + role="hg-opt-backout">--merge</option> option will do the + <quote>right thing</quote> whether or not the changeset + you're backing out is the tip (i.e. it won't try to merge if + it's backing out the tip, since there's no need), you should + <emphasis>always</emphasis> use this option when you run the + <command role="hg-cmd">hg backout</command> command.</para> + + </sect3> + </sect2> + <sect2> + <title>Gaining more control of the backout process</title> + + <para id="x_105">While I've recommended that you always use the <option + role="hg-opt-backout">--merge</option> option when backing + out a change, the <command role="hg-cmd">hg backout</command> + command lets you decide how to merge a backout changeset. + Taking control of the backout process by hand is something you + will rarely need to do, but it can be useful to understand + what the <command role="hg-cmd">hg backout</command> command + is doing for you automatically. 
To illustrate this, let's + clone our first repository, but omit the backout change that + it contains.</para> + + &interaction.backout.manual.clone; + + <para id="x_106">As with our + earlier example, we'll commit a third changeset, then back out + its parent, and see what happens.</para> + + &interaction.backout.manual.backout; + + <para id="x_107">Our new changeset is again a descendant of the changeset + we backed out; it's thus a new head, <emphasis>not</emphasis> + a descendant of the changeset that was the tip. The <command + role="hg-cmd">hg backout</command> command was quite + explicit in telling us this.</para> + + &interaction.backout.manual.log; + + <para id="x_108">Again, it's easier to see what has happened by looking at + a graph of the revision history, in <xref + linkend="fig:undo:backout-manual"/>. This makes it clear + that when we use <command role="hg-cmd">hg backout</command> + to back out a change other than the tip, Mercurial adds a new + head to the repository (the change it committed is + box-shaped).</para> + + <figure id="fig:undo:backout-manual"> + <title>Backing out a change using the <command + role="hg-cmd">hg backout</command> command</title> + <mediaobject> + <imageobject><imagedata fileref="figs/undo-manual.png"/></imageobject> + <textobject><phrase>XXX add text</phrase></textobject> + </mediaobject> + </figure> + + <para id="x_10a">After the <command role="hg-cmd">hg backout</command> + command has completed, it leaves the new + <quote>backout</quote> changeset as the parent of the working + directory.</para> + + &interaction.backout.manual.parents; + + <para id="x_10b">Now we have two isolated sets of changes.</para> + + &interaction.backout.manual.heads; + + <para id="x_10c">Let's think about what we expect to see as the contents of + <filename>myfile</filename> now. The first change should be + present, because we've never backed it out. The second change + should be missing, as that's the change we backed out. 
Since + the history graph shows the third change as a separate head, + we <emphasis>don't</emphasis> expect to see the third change + present in <filename>myfile</filename>.</para> + + &interaction.backout.manual.cat; + + <para id="x_10d">To get the third change back into the file, we just do a + normal merge of our two heads.</para> + + &interaction.backout.manual.merge; + + <para id="x_10e">Afterwards, the graphical history of our + repository looks like + <xref linkend="fig:undo:backout-manual-merge"/>.</para> + + <figure id="fig:undo:backout-manual-merge"> + <title>Manually merging a backout change</title> + <mediaobject> + <imageobject><imagedata fileref="figs/undo-manual-merge.png"/></imageobject> + <textobject><phrase>XXX add text</phrase></textobject> + </mediaobject> + </figure> + + </sect2> + <sect2> + <title>Why <command role="hg-cmd">hg backout</command> works as + it does</title> + + <para id="x_110">Here's a brief description of how the <command + role="hg-cmd">hg backout</command> command works.</para> + <orderedlist> + <listitem><para id="x_111">It ensures that the working directory is + <quote>clean</quote>, i.e. that the output of <command + role="hg-cmd">hg status</command> would be empty.</para> + </listitem> + <listitem><para id="x_112">It remembers the current parent of the working + directory. Let's call this changeset + <literal>orig</literal>.</para> + </listitem> + <listitem><para id="x_113">It does the equivalent of a <command + role="hg-cmd">hg update</command> to sync the working + directory to the changeset you want to back out. Let's + call this changeset <literal>backout</literal>.</para> + </listitem> + <listitem><para id="x_114">It finds the parent of that changeset. 
Let's + call that changeset <literal>parent</literal>.</para> + </listitem> + <listitem><para id="x_115">For each file that the + <literal>backout</literal> changeset affected, it does the + equivalent of a <command role="hg-cmd">hg revert -r + parent</command> on that file, to restore it to the + contents it had before that changeset was + committed.</para> + </listitem> + <listitem><para id="x_116">It commits the result as a new changeset. + This changeset has <literal>backout</literal> as its + parent.</para> + </listitem> + <listitem><para id="x_117">If you specify <option + role="hg-opt-backout">--merge</option> on the command + line, it merges with <literal>orig</literal>, and commits + the result of the merge.</para> + </listitem></orderedlist> + + <para id="x_118">An alternative way to implement the <command + role="hg-cmd">hg backout</command> command would be to + <command role="hg-cmd">hg export</command> the + to-be-backed-out changeset as a diff, then use the <option + role="cmd-opt-patch">--reverse</option> option to the + <command>patch</command> command to reverse the effect of the + change without fiddling with the working directory. 
This + sounds much simpler, but it would not work nearly as + well.</para> + + <para id="x_119">The reason that <command role="hg-cmd">hg + backout</command> does an update, a commit, a merge, and + another commit is to give the merge machinery the best chance + to do a good job when dealing with all the changes + <emphasis>between</emphasis> the change you're backing out and + the current tip.</para> + + <para id="x_11a">If you're backing out a changeset that's 100 revisions + back in your project's history, the chances that the + <command>patch</command> command will be able to apply a + reverse diff cleanly are not good, because intervening changes + are likely to have <quote>broken the context</quote> that + <command>patch</command> uses to determine whether it can + apply a patch (if this sounds like gibberish, see <xref + linkend="sec:mq:patch"/> for a + discussion of the <command>patch</command> command). Also, + Mercurial's merge machinery will handle files and directories + being renamed, permission changes, and modifications to binary + files, none of which <command>patch</command> can deal + with.</para> + + </sect2> + </sect1> + <sect1 id="sec:undo:aaaiiieee"> + <title>Changes that should never have been</title> + + <para id="x_11b">Most of the time, the <command role="hg-cmd">hg + backout</command> command is exactly what you need if you want + to undo the effects of a change. It leaves a permanent record + of exactly what you did, both when committing the original + changeset and when you cleaned up after it.</para> + + <para id="x_11c">On rare occasions, though, you may find that you've + committed a change that really should not be present in the + repository at all. For example, it would be very unusual, and + usually considered a mistake, to commit a software project's + object files as well as its source files. 
Object files have + almost no intrinsic value, and they're <emphasis>big</emphasis>, + so they increase the size of the repository and the amount of + time it takes to clone or pull changes.</para> + + <para id="x_11d">Before I discuss the options that you have if you commit a + <quote>brown paper bag</quote> change (the kind that's so bad + that you want to pull a brown paper bag over your head), let me + first discuss some approaches that probably won't work.</para> + + <para id="x_11e">Since Mercurial treats history as + accumulative&emdash;every change builds on top of all changes + that preceded it&emdash;you generally can't just make disastrous + changes disappear. The one exception is when you've just + committed a change, and it hasn't been pushed or pulled into + another repository. That's when you can safely use the <command + role="hg-cmd">hg rollback</command> command, as I detailed in + <xref linkend="sec:undo:rollback"/>.</para> + + <para id="x_11f">After you've pushed a bad change to another repository, you + <emphasis>could</emphasis> still use <command role="hg-cmd">hg + rollback</command> to make your local copy of the change + disappear, but it won't have the consequences you want. The + change will still be present in the remote repository, so it + will reappear in your local repository the next time you + pull.</para> + + <para id="x_120">If a situation like this arises, and you know which + repositories your bad change has propagated into, you can + <emphasis>try</emphasis> to get rid of the change from + <emphasis>every</emphasis> one of those repositories. This is, + of course, not a satisfactory solution: if you miss even a + single repository while you're expunging, the change is still + <quote>in the wild</quote>, and could propagate further.</para> + + <para id="x_121">If you've committed one or more changes + <emphasis>after</emphasis> the change that you'd like to see + disappear, your options are further reduced. 
Mercurial doesn't + provide a way to <quote>punch a hole</quote> in history, leaving + the other changesets intact.</para> + + <para id="x_122">XXX This needs filling out. The + <literal>hg-replay</literal> script in the + <literal>examples</literal> directory works, but doesn't handle + merge changesets. Kind of an important omission.</para> + + <sect2> + <title>Protect yourself from <quote>escaped</quote> + changes</title> + + <para id="x_123">If you've committed some changes to your local repository + and they've been pushed or pulled somewhere else, this isn't + necessarily a disaster. You can protect yourself ahead of + time against some classes of bad changeset. This is + particularly easy if your team usually pulls changes from a + central repository.</para> + + <para id="x_124">By configuring some hooks on that repository to validate + incoming changesets (see chapter <xref linkend="chap:hook"/>), + you can + automatically prevent some kinds of bad changeset from being + pushed to the central repository at all. With such a + configuration in place, some kinds of bad changeset will + naturally tend to <quote>die out</quote> because they can't + propagate into the central repository. Better yet, this + happens without any need for explicit intervention.</para> + + <para id="x_125">For instance, an incoming change hook that verifies that a + changeset will actually compile can prevent people from + inadvertently <quote>breaking the build</quote>.</para> + + </sect2> + </sect1> + <sect1 id="sec:undo:bisect"> + <title>Finding the source of a bug</title> + + <para id="x_126">While it's all very well to be able to back out a changeset + that introduced a bug, this requires that you know which + changeset to back out. 
Mercurial provides an invaluable + command, called <command role="hg-cmd">hg bisect</command>, that + helps you to automate this process and accomplish it very + efficiently.</para> + + <para id="x_127">The idea behind the <command role="hg-cmd">hg + bisect</command> command is that a changeset has introduced + some change of behavior that you can identify with a simple + binary test. You don't know which piece of code introduced the + change, but you know how to test for the presence of the bug. + The <command role="hg-cmd">hg bisect</command> command uses your + test to direct its search for the changeset that introduced the + code that caused the bug.</para> + + <para id="x_128">Here are a few scenarios to help you understand how you + might apply this command.</para> + <itemizedlist> + <listitem><para id="x_129">The most recent version of your software has a + bug that you remember wasn't present a few weeks ago, but + you don't know when it was introduced. Here, your binary + test checks for the presence of that bug.</para> + </listitem> + <listitem><para id="x_12a">You fixed a bug in a rush, and now it's time to + close the entry in your team's bug database. The bug + database requires a changeset ID when you close an entry, + but you don't remember which changeset you fixed the bug in. + Once again, your binary test checks for the presence of the + bug.</para> + </listitem> + <listitem><para id="x_12b">Your software works correctly, but runs 15% + slower than the last time you measured it. You want to know + which changeset introduced the performance regression. 
In + this case, your binary test measures the performance of your + software, to see whether it's <quote>fast</quote> or + <quote>slow</quote>.</para> + </listitem> + <listitem><para id="x_12c">The sizes of the components of your project that + you ship exploded recently, and you suspect that something + changed in the way you build your project.</para> + </listitem></itemizedlist> + + <para id="x_12d">From these examples, it should be clear that the <command + role="hg-cmd">hg bisect</command> command is useful not only + for finding the sources of bugs. You can use it to find any + <quote>emergent property</quote> of a repository (anything that + you can't find from a simple text search of the files in the + tree) for which you can write a binary test.</para> + + <para id="x_12e">We'll introduce a little bit of terminology here, just to + make it clear which parts of the search process are your + responsibility, and which are Mercurial's. A + <emphasis>test</emphasis> is something that + <emphasis>you</emphasis> run when <command role="hg-cmd">hg + bisect</command> chooses a changeset. A + <emphasis>probe</emphasis> is what <command role="hg-cmd">hg + bisect</command> runs to tell whether a revision is good. + Finally, we'll use the word <quote>bisect</quote>, as both a + noun and a verb, to stand in for the phrase <quote>search using + the <command role="hg-cmd">hg bisect</command> + command</quote>.</para> + + <para id="x_12f">One simple way to automate the searching process would be + simply to probe every changeset. However, this scales poorly. + If it took ten minutes to test a single changeset, and you had + 10,000 changesets in your repository, the exhaustive approach + would take on average 35 <emphasis>days</emphasis> to find the + changeset that introduced a bug. 
Even if you knew that the bug + was introduced by one of the last 500 changesets, and limited + your search to those, you'd still be looking at over 40 hours to + find the changeset that introduced your bug.</para> + + <para id="x_130">What the <command role="hg-cmd">hg bisect</command> command + does is use its knowledge of the <quote>shape</quote> of your + project's revision history to perform a search in time + proportional to the <emphasis>logarithm</emphasis> of the number + of changesets to check (the kind of search it performs is called + a dichotomic search). With this approach, searching through + 10,000 changesets will take less than three hours, even at ten + minutes per test (the search will require about 14 tests). + Limit your search to the last hundred changesets, and it will + take only about an hour (roughly seven tests).</para> + + <para id="x_131">The <command role="hg-cmd">hg bisect</command> command is + aware of the <quote>branchy</quote> nature of a Mercurial + project's revision history, so it has no problems dealing with + branches, merges, or multiple heads in a repository. It can + prune entire branches of history with a single probe, which is + how it operates so efficiently.</para> + + <sect2> + <title>Using the <command role="hg-cmd">hg bisect</command> + command</title> + + <para id="x_132">Here's an example of <command role="hg-cmd">hg + bisect</command> in action.</para> + + <note> + <para id="x_133"> In versions 0.9.5 and earlier of Mercurial, <command + role="hg-cmd">hg bisect</command> was not a core command: + it was distributed with Mercurial as an extension. 
This + section describes the built-in command, not the old + extension.</para> + </note> + + <para id="x_134">Now let's create a repository, so that we can try out the + <command role="hg-cmd">hg bisect</command> command in + isolation.</para> + + &interaction.bisect.init; + + <para id="x_135">We'll simulate a project that has a bug in it in a + simple-minded way: create trivial changes in a loop, and + nominate one specific change that will have the + <quote>bug</quote>. This loop creates 35 changesets, each + adding a single file to the repository. We'll represent our + <quote>bug</quote> with a file that contains the text <quote>i + have a gub</quote>.</para> + + &interaction.bisect.commits; + + <para id="x_136">The next thing that we'd like to do is figure out how to + use the <command role="hg-cmd">hg bisect</command> command. + We can use Mercurial's normal built-in help mechanism for + this.</para> + + &interaction.bisect.help; + + <para id="x_137">The <command role="hg-cmd">hg bisect</command> command + works in steps. 
Each step proceeds as follows.</para> + <orderedlist> + <listitem><para id="x_138">You run your binary test.</para> + <itemizedlist> + <listitem><para id="x_139">If the test succeeded, you tell <command + role="hg-cmd">hg bisect</command> by running the + <command role="hg-cmd">hg bisect --good</command> + command.</para> + </listitem> + <listitem><para id="x_13a">If it failed, run the <command + role="hg-cmd">hg bisect --bad</command> + command.</para></listitem></itemizedlist> + </listitem> + <listitem><para id="x_13b">The command uses your information to decide + which changeset to test next.</para> + </listitem> + <listitem><para id="x_13c">It updates the working directory to that + changeset, and the process begins again.</para> + </listitem></orderedlist> + <para id="x_13d">The process ends when <command role="hg-cmd">hg + bisect</command> identifies a unique changeset that marks + the point where your test transitioned from + <quote>succeeding</quote> to <quote>failing</quote>.</para> + + <para id="x_13e">To start the search, we must run the <command + role="hg-cmd">hg bisect --reset</command> command, which + clears any state left over from an earlier search.</para> + + &interaction.bisect.search.init; + + <para id="x_13f">In our case, the binary test we use is simple: we check to + see if any file in the repository contains the string <quote>i + have a gub</quote>. If it does, this changeset contains the + change that <quote>caused the bug</quote>. 
By convention, a + changeset that has the property we're searching for is + <quote>bad</quote>, while one that doesn't is + <quote>good</quote>.</para> + + <para id="x_140">Most of the time, the revision to which the working + directory is synced (usually the tip) already exhibits the + problem introduced by the buggy change, so we'll mark it as + <quote>bad</quote>.</para> + + &interaction.bisect.search.bad-init; + + <para id="x_141">Our next task is to nominate a changeset that we know + <emphasis>doesn't</emphasis> have the bug; the <command + role="hg-cmd">hg bisect</command> command will + <quote>bracket</quote> its search between the first pair of + good and bad changesets. In our case, we know that revision + 10 didn't have the bug. (I'll have more words about choosing + the first <quote>good</quote> changeset later.)</para> + + &interaction.bisect.search.good-init; + + <para id="x_142">Notice that this command printed some output.</para> + <itemizedlist> + <listitem><para id="x_143">It told us how many changesets it must + consider before it can identify the one that introduced + the bug, and how many tests that will require.</para> + </listitem> + <listitem><para id="x_144">It updated the working directory to the next + changeset to test, and told us which changeset it's + testing.</para> + </listitem></itemizedlist> + + <para id="x_145">We now run our test in the working directory. We use the + <command>grep</command> command to see if our + <quote>bad</quote> file is present in the working directory. + If it is, this revision is bad; if not, this revision is good. 
+ &interaction.bisect.search.step1;</para> + + <para id="x_146">This test looks like a perfect candidate for automation, + so let's turn it into a shell function.</para> + &interaction.bisect.search.mytest; + + <para id="x_147">We can now run an entire test step with a single command, + <literal>mytest</literal>.</para> + + &interaction.bisect.search.step2; + + <para id="x_148">A few more invocations of our canned test step command, + and we're done.</para> + + &interaction.bisect.search.rest; + + <para id="x_149">Even though we had 35 changesets to search through, the + <command role="hg-cmd">hg bisect</command> command let us find + the changeset that introduced our <quote>bug</quote> with only + five tests. Because the number of tests that the <command + role="hg-cmd">hg bisect</command> command performs grows + logarithmically with the number of changesets to search, the + advantage that it has over the <quote>brute force</quote> + search approach increases with every changeset you add.</para> + + </sect2> + <sect2> + <title>Cleaning up after your search</title> + + <para id="x_14a">When you're finished using the <command role="hg-cmd">hg + bisect</command> command in a repository, you can use the + <command role="hg-cmd">hg bisect --reset</command> command to + drop the information it was using to drive your search. This + information doesn't use much space, so it doesn't matter if you + forget to run this command. However, <command + role="hg-cmd">hg bisect</command> won't let you start a new + search in that repository until you run <command + role="hg-cmd">hg bisect --reset</command>.</para> + + &interaction.bisect.search.reset; + + </sect2> + </sect1> + <sect1> + <title>Tips for finding bugs effectively</title> + + <sect2> + <title>Give consistent input</title> + + <para id="x_14b">The <command role="hg-cmd">hg bisect</command> command + requires that you correctly report the result of every test + you perform. 
If you tell it that a test failed when it really + succeeded, it <emphasis>might</emphasis> be able to detect the + inconsistency. If it can identify an inconsistency in your + reports, it will tell you that a particular changeset is both + good and bad. However, it can't do this perfectly; it's about + as likely to report the wrong changeset as the source of the + bug.</para> + + </sect2> + <sect2> + <title>Automate as much as possible</title> + + <para id="x_14c">When I started using the <command role="hg-cmd">hg + bisect</command> command, I tried a few times to run my + tests by hand, on the command line. This is an approach that + I, at least, am not suited to. After a few tries, I found + that I was making enough mistakes that I was having to restart + my searches several times before finally getting correct + results.</para> + + <para id="x_14d">My initial problems with driving the <command + role="hg-cmd">hg bisect</command> command by hand occurred + even with simple searches on small repositories; if the + problem you're looking for is more subtle, or the number of + tests that <command role="hg-cmd">hg bisect</command> must + perform increases, the likelihood of operator error ruining + the search is much higher. Once I started automating my + tests, I had much better results.</para> + + <para id="x_14e">The key to automated testing is twofold:</para> + <itemizedlist> + <listitem><para id="x_14f">always test for the same symptom, and</para> + </listitem> + <listitem><para id="x_150">always feed consistent input to the <command + role="hg-cmd">hg bisect</command> command.</para> + </listitem></itemizedlist> + <para id="x_151">In my tutorial example above, the <command>grep</command> + command tests for the symptom, and the <literal>if</literal> + statement takes the result of this check and ensures that we + always feed the same input to the <command role="hg-cmd">hg + bisect</command> command. 
The <literal>mytest</literal> + function marries these together in a reproducible way, so that + every test is uniform and consistent.</para> + + </sect2> + <sect2> + <title>Check your results</title> + + <para id="x_152">Because the output of a <command role="hg-cmd">hg + bisect</command> search is only as good as the input you + give it, don't take the changeset it reports as the absolute + truth. A simple way to cross-check its report is to manually + run your test at each of the following changesets:</para> + <itemizedlist> + <listitem><para id="x_153">The changeset that it reports as the first bad + revision. Your test should still report this as + bad.</para> + </listitem> + <listitem><para id="x_154">The parent of that changeset (either parent, + if it's a merge). Your test should report this changeset + as good.</para> + </listitem> + <listitem><para id="x_155">A child of that changeset. Your test should + report this changeset as bad.</para> + </listitem></itemizedlist> + + </sect2> + <sect2> + <title>Beware interference between bugs</title> + + <para id="x_156">It's possible that your search for one bug could be + disrupted by the presence of another. For example, let's say + your software crashes at revision 100, and worked correctly at + revision 50. Unknown to you, someone else introduced a + different crashing bug at revision 60, and fixed it at + revision 80. This could distort your results in one of + several ways.</para> + + <para id="x_157">It is possible that this other bug completely + <quote>masks</quote> yours, which is to say that it occurs + before your bug has a chance to manifest itself. If you can't + avoid that other bug (for example, it prevents your project + from building), and so can't tell whether your bug is present + in a particular changeset, the <command role="hg-cmd">hg + bisect</command> command cannot help you directly. 
Instead, + you can mark a changeset as untested by running <command + role="hg-cmd">hg bisect --skip</command>.</para> + + <para id="x_158">A different problem could arise if your test for a bug's + presence is not specific enough. If you check for <quote>my + program crashes</quote>, then both your crashing bug and an + unrelated crashing bug that masks it will look like the same + thing, and mislead <command role="hg-cmd">hg + bisect</command>.</para> + + <para id="x_159">Another useful situation in which to use <command + role="hg-cmd">hg bisect --skip</command> is if you can't + test a revision because your project was in a broken and hence + untestable state at that revision, perhaps because someone + checked in a change that prevented the project from + building.</para> + + </sect2> + <sect2> + <title>Bracket your search lazily</title> + + <para id="x_15a">Choosing the first <quote>good</quote> and + <quote>bad</quote> changesets that will mark the end points of + your search is often easy, but it bears a little discussion + nevertheless. From the perspective of <command + role="hg-cmd">hg bisect</command>, the <quote>newest</quote> + changeset is conventionally <quote>bad</quote>, and the older + changeset is <quote>good</quote>.</para> + + <para id="x_15b">If you're having trouble remembering when a suitable + <quote>good</quote> change was, so that you can tell <command + role="hg-cmd">hg bisect</command>, you could do worse than + testing changesets at random. 
Just remember to eliminate + contenders that can't possibly exhibit the bug (perhaps + because the feature with the bug isn't present yet) and those + where another problem masks the bug (as I discussed + above).</para> + + <para id="x_15c">Even if you end up <quote>early</quote> by thousands of + changesets or months of history, you will only add a handful + of tests to the total number that <command role="hg-cmd">hg + bisect</command> must perform, thanks to its logarithmic + behavior.</para> + + </sect2> + </sect1> +</chapter> + +<!-- +local variables: +sgml-parent-document: ("00book.xml" "book" "chapter") +end: +-->