view en/ch08-undo.xml @ 800:1a30d2627512
Propagate 2ff0a43f1152
Update ch03
author | Yoshiki Yazawa <yaz@honeyplanet.jp> |
---|---|
date | Thu, 18 Jun 2009 20:04:44 +0900 |
parents | b338f5490029 |
children | 7226e5e750a6 |
<!-- vim: set filetype=docbkxml shiftwidth=2 autoindent expandtab tw=77 : --> <chapter id="chap:undo"> <?dbhtml filename="finding-and-fixing-mistakes.html"?> <title>Finding and fixing mistakes</title> <para id="x_d2">To err might be human, but to really handle the consequences well takes a top-notch revision control system. In this chapter, we'll discuss some of the techniques you can use when you find that a problem has crept into your project. Mercurial has some highly capable features that will help you to isolate the sources of problems, and to handle them appropriately.</para> <sect1> <title>Erasing local history</title> <sect2> <title>The accidental commit</title> <para id="x_d3">I have the occasional but persistent problem of typing rather more quickly than I can think, which sometimes results in me committing a changeset that is either incomplete or plain wrong. In my case, the usual kind of incomplete changeset is one in which I've created a new source file, but forgotten to <command role="hg-cmd">hg add</command> it. A <quote>plain wrong</quote> changeset is not as common, but no less annoying.</para> </sect2> <sect2 id="sec:undo:rollback"> <title>Rolling back a transaction</title> <para id="x_d4">In <xref linkend="sec:concepts:txn"/>, I mentioned that Mercurial treats each modification of a repository as a <emphasis>transaction</emphasis>. Every time you commit a changeset or pull changes from another repository, Mercurial remembers what you did. You can undo, or <emphasis>roll back</emphasis>, exactly one of these actions using the <command role="hg-cmd">hg rollback</command> command. 
(See <xref linkend="sec:undo:rollback-after-push"/> for an important caveat about the use of this command.)</para> <para id="x_d5">Here's a mistake that I often find myself making: committing a change in which I've created a new file, but forgotten to <command role="hg-cmd">hg add</command> it.</para> &interaction.rollback.commit; <para id="x_d6">Looking at the output of <command role="hg-cmd">hg status</command> after the commit immediately confirms the error.</para> &interaction.rollback.status; <para id="x_d7">The commit captured the changes to the file <filename>a</filename>, but not the new file <filename>b</filename>. If I were to push this changeset to a repository that I shared with a colleague, the chances are high that something in <filename>a</filename> would refer to <filename>b</filename>, which would not be present in their repository when they pulled my changes. I would thus become the object of some indignation.</para> <para id="x_d8">However, luck is with me&emdash;I've caught my error before I pushed the changeset. I use the <command role="hg-cmd">hg rollback</command> command, and Mercurial makes that last changeset vanish.</para> &interaction.rollback.rollback; <para id="x_d9">Notice that the changeset is no longer present in the repository's history, and the working directory once again thinks that the file <filename>a</filename> is modified. The commit and rollback have left the working directory exactly as it was prior to the commit; the changeset has been completely erased. I can now safely <command role="hg-cmd">hg add</command> the file <filename>b</filename>, and rerun my commit.</para> &interaction.rollback.add; </sect2> <sect2> <title>The erroneous pull</title> <para id="x_da">It's common practice with Mercurial to maintain separate development branches of a project in different repositories. 
Your development team might have one shared repository for your project's <quote>0.9</quote> release, and another, containing different changes, for the <quote>1.0</quote> release.</para> <para id="x_db">Given this, you can imagine that the consequences could be messy if you had a local <quote>0.9</quote> repository, and accidentally pulled changes from the shared <quote>1.0</quote> repository into it. At worst, you could be paying insufficient attention, and push those changes into the shared <quote>0.9</quote> tree, confusing your entire team (but don't worry, we'll return to this horror scenario later). However, it's more likely that you'll notice immediately, because Mercurial will display the URL it's pulling from, or you will see it pull a suspiciously large number of changes into the repository.</para> <para id="x_dc">The <command role="hg-cmd">hg rollback</command> command will work nicely to expunge all of the changesets that you just pulled. Mercurial groups all changes from one <command role="hg-cmd">hg pull</command> into a single transaction, so one <command role="hg-cmd">hg rollback</command> is all you need to undo this mistake.</para> </sect2> <sect2 id="sec:undo:rollback-after-push"> <title>Rolling back is useless once you've pushed</title> <para id="x_dd">The value of the <command role="hg-cmd">hg rollback</command> command drops to zero once you've pushed your changes to another repository. Rolling back a change makes it disappear entirely, but <emphasis>only</emphasis> in the repository in which you perform the <command role="hg-cmd">hg rollback</command>. Because a rollback eliminates history, there's no way for the disappearance of a change to propagate between repositories.</para> <para id="x_de">If you've pushed a change to another repository&emdash;particularly if it's a shared repository&emdash;it has essentially <quote>escaped into the wild,</quote> and you'll have to recover from your mistake in a different way. 
If you push a changeset somewhere, then roll it back, then pull from the repository you pushed to, the changeset will reappear in your repository.</para> <para id="x_df">(If you absolutely know for sure that the change you want to roll back is the most recent change in the repository that you pushed to, <emphasis>and</emphasis> you know that nobody else could have pulled it from that repository, you can roll back the changeset there, too, but you should not rely on this working reliably. If you do this, sooner or later a change really will make it into a repository that you don't directly control (or have forgotten about), and come back to bite you.)</para> </sect2> <sect2> <title>You can only roll back once</title> <para id="x_e0">Mercurial stores exactly one transaction in its transaction log; that transaction is the most recent one that occurred in the repository. This means that you can only roll back one transaction. If you expect to be able to roll back one transaction, then its predecessor, this is not the behavior you will get.</para> &interaction.rollback.twice; <para id="x_e1">Once you've rolled back one transaction in a repository, you can't roll back again in that repository until you perform another commit or pull.</para> </sect2> </sect1> <sect1> <title>Reverting the mistaken change</title> <para id="x_e2">If you make a modification to a file, and decide that you really didn't want to change the file at all, and you haven't yet committed your changes, the <command role="hg-cmd">hg revert</command> command is the one you'll need. It looks at the changeset that's the parent of the working directory, and restores the contents of the file to their state as of that changeset. (That's a long-winded way of saying that, in the normal case, it undoes your modifications.)</para> <para id="x_e3">Let's illustrate how the <command role="hg-cmd">hg revert</command> command works with yet another small example. 
We'll begin by modifying a file that Mercurial is already tracking.</para> &interaction.daily.revert.modify; <para id="x_e4">If we don't want that change, we can simply <command role="hg-cmd">hg revert</command> the file.</para> &interaction.daily.revert.unmodify; <para id="x_e5">The <command role="hg-cmd">hg revert</command> command provides us with an extra degree of safety by saving our modified file with a <filename>.orig</filename> extension.</para> &interaction.daily.revert.status; <para id="x_e6">Here is a summary of the cases that the <command role="hg-cmd">hg revert</command> command can deal with. We will describe each of these in more detail in the section that follows.</para> <itemizedlist> <listitem><para id="x_e7">If you modify a file, it will restore the file to its unmodified state.</para> </listitem> <listitem><para id="x_e8">If you <command role="hg-cmd">hg add</command> a file, it will undo the <quote>added</quote> state of the file, but leave the file itself untouched.</para> </listitem> <listitem><para id="x_e9">If you delete a file without telling Mercurial, it will restore the file to its unmodified contents.</para> </listitem> <listitem><para id="x_ea">If you use the <command role="hg-cmd">hg remove</command> command to remove a file, it will undo the <quote>removed</quote> state of the file, and restore the file to its unmodified contents.</para> </listitem></itemizedlist> <sect2 id="sec:undo:mgmt"> <title>File management errors</title> <para id="x_eb">The <command role="hg-cmd">hg revert</command> command is useful for more than just modified files. It lets you reverse the results of all of Mercurial's file management commands&emdash;<command role="hg-cmd">hg add</command>, <command role="hg-cmd">hg remove</command>, and so on.</para> <para id="x_ec">If you <command role="hg-cmd">hg add</command> a file, then decide that in fact you don't want Mercurial to track it, use <command role="hg-cmd">hg revert</command> to undo the add. 
Don't worry; Mercurial will not modify the file in any way. It will just <quote>unmark</quote> the file.</para> &interaction.daily.revert.add; <para id="x_ed">Similarly, if you ask Mercurial to <command role="hg-cmd">hg remove</command> a file, you can use <command role="hg-cmd">hg revert</command> to restore it to the contents it had as of the parent of the working directory. &interaction.daily.revert.remove; This works just as well for a file that you deleted by hand, without telling Mercurial (recall that in Mercurial terminology, this kind of file is called <quote>missing</quote>).</para> &interaction.daily.revert.missing; <para id="x_ee">If you revert a <command role="hg-cmd">hg copy</command>, the copied-to file remains in your working directory afterwards, untracked. Since a copy doesn't affect the copied-from file in any way, Mercurial doesn't do anything with the copied-from file.</para> &interaction.daily.revert.copy; <sect3> <title>A slightly special case: reverting a rename</title> <para id="x_ef">If you <command role="hg-cmd">hg rename</command> a file, there is one small detail that you should remember. When you <command role="hg-cmd">hg revert</command> a rename, it's not enough to provide the name of the renamed-to file, as you can see here.</para> &interaction.daily.revert.rename; <para id="x_f0">As you can see from the output of <command role="hg-cmd">hg status</command>, the renamed-to file is no longer identified as added, but the renamed-<emphasis>from</emphasis> file is still removed! 
This is counter-intuitive (at least to me), but it's easy to deal with.</para> &interaction.daily.revert.rename-orig; <para id="x_f1">So remember, to revert a <command role="hg-cmd">hg rename</command>, you must provide <emphasis>both</emphasis> the source and destination names.</para> <para id="x_f2">% TODO: the output doesn't look like it will be removed!</para> <para id="x_f3">(By the way, if you rename a file, then modify the renamed-to file, then revert both components of the rename, when Mercurial restores the file that was removed as part of the rename, it will be unmodified. If you need the modifications in the renamed-to file to show up in the renamed-from file, don't forget to copy them over.)</para> <para id="x_f4">These fiddly aspects of reverting a rename arguably constitute a small bug in Mercurial.</para> </sect3> </sect2> </sect1> <sect1> <title>Dealing with committed changes</title> <para id="x_f5">Consider a case where you have committed a change <emphasis>a</emphasis>, and another change <emphasis>b</emphasis> on top of it; you then realise that change <emphasis>a</emphasis> was incorrect. Mercurial lets you <quote>back out</quote> an entire changeset automatically, and provides building blocks that let you reverse part of a changeset by hand.</para> <para id="x_f6">Before you read this section, here's something to keep in mind: the <command role="hg-cmd">hg backout</command> command undoes changes by <emphasis>adding</emphasis> history, not by modifying or erasing it. It's the right tool to use if you're fixing bugs, but not if you're trying to undo some change that has catastrophic consequences. To deal with those, see <xref linkend="sec:undo:aaaiiieee"/>.</para> <sect2> <title>Backing out a changeset</title> <para id="x_f7">The <command role="hg-cmd">hg backout</command> command lets you <quote>undo</quote> the effects of an entire changeset in an automated fashion. Because Mercurial's history is immutable, this command <emphasis>does not</emphasis> get rid of the changeset you want to undo. 
Instead, it creates a new changeset that <emphasis>reverses</emphasis> the effect of the to-be-undone changeset.</para> <para id="x_f8">The operation of the <command role="hg-cmd">hg backout</command> command is a little intricate, so let's illustrate it with some examples. First, we'll create a repository with some simple changes.</para> &interaction.backout.init; <para id="x_f9">The <command role="hg-cmd">hg backout</command> command takes a single changeset ID as its argument; this is the changeset to back out. Normally, <command role="hg-cmd">hg backout</command> will drop you into a text editor to write a commit message, so you can record why you're backing the change out. In this example, we provide a commit message on the command line using the <option role="hg-opt-backout">-m</option> option.</para> </sect2> <sect2> <title>Backing out the tip changeset</title> <para id="x_fa">We're going to start by backing out the last changeset we committed.</para> &interaction.backout.simple; <para id="x_fb">You can see that the second line from <filename>myfile</filename> is no longer present. Taking a look at the output of <command role="hg-cmd">hg log</command> gives us an idea of what the <command role="hg-cmd">hg backout</command> command has done. &interaction.backout.simple.log; Notice that the new changeset that <command role="hg-cmd">hg backout</command> has created is a child of the changeset we backed out. It's easier to see this in <xref linkend="fig:undo:backout"/>, which presents a graphical view of the change history. 
As you can see, the history is nice and linear.</para> <figure id="fig:undo:backout"> <title>Backing out a change using the <command role="hg-cmd">hg backout</command> command</title> <mediaobject> <imageobject><imagedata fileref="figs/undo-simple.png"/></imageobject> <textobject><phrase>XXX add text</phrase></textobject> </mediaobject> </figure> </sect2> <sect2> <title>Backing out a non-tip change</title> <para id="x_fd">If you want to back out a change other than the last one you committed, pass the <option role="hg-opt-backout">--merge</option> option to the <command role="hg-cmd">hg backout</command> command.</para> &interaction.backout.non-tip.clone; <para id="x_fe">This makes backing out any changeset a <quote>one-shot</quote> operation that's usually simple and fast.</para> &interaction.backout.non-tip.backout; <para id="x_ff">If you take a look at the contents of <filename>myfile</filename> after the backout finishes, you'll see that the first and third changes are present, but not the second.</para> &interaction.backout.non-tip.cat; <para id="x_100">As the graphical history in <xref linkend="fig:undo:backout-non-tip"/> illustrates, Mercurial actually commits <emphasis>two</emphasis> changes in this kind of situation (the box-shaped nodes are the ones that Mercurial commits automatically). Before Mercurial begins the backout process, it first remembers what the current parent of the working directory is. It then backs out the target changeset, and commits that as a changeset. 
Finally, it merges back to the previous parent of the working directory, and commits the result of the merge.</para> <para id="x_101">% TODO: to me it looks like mercurial doesn't commit the second merge automatically!</para> <figure id="fig:undo:backout-non-tip"> <title>Automated backout of a non-tip change using the <command role="hg-cmd">hg backout</command> command</title> <mediaobject> <imageobject><imagedata fileref="figs/undo-non-tip.png"/></imageobject> <textobject><phrase>XXX add text</phrase></textobject> </mediaobject> </figure> <para id="x_103">The result is that you end up <quote>back where you were</quote>, only with some extra history that undoes the effect of the changeset you wanted to back out.</para> <sect3> <title>Always use the <option role="hg-opt-backout">--merge</option> option</title> <para id="x_104">In fact, since the <option role="hg-opt-backout">--merge</option> option will do the <quote>right thing</quote> whether or not the changeset you're backing out is the tip (i.e. it won't try to merge if it's backing out the tip, since there's no need), you should <emphasis>always</emphasis> use this option when you run the <command role="hg-cmd">hg backout</command> command.</para> </sect3> </sect2> <sect2> <title>Gaining more control of the backout process</title> <para id="x_105">While I've recommended that you always use the <option role="hg-opt-backout">--merge</option> option when backing out a change, the <command role="hg-cmd">hg backout</command> command lets you decide how to merge a backout changeset. Taking control of the backout process by hand is something you will rarely need to do, but it can be useful to understand what the <command role="hg-cmd">hg backout</command> command is doing for you automatically. 
To illustrate this, let's clone our first repository, but omit the backout change that it contains.</para> &interaction.backout.manual.clone; <para id="x_106">As with our earlier example, we'll commit a third changeset, then back out its parent, and see what happens.</para> &interaction.backout.manual.backout; <para id="x_107">Our new changeset is again a descendant of the changeset we backed out; it's thus a new head, <emphasis>not</emphasis> a descendant of the changeset that was the tip. The <command role="hg-cmd">hg backout</command> command was quite explicit in telling us this.</para> &interaction.backout.manual.log; <para id="x_108">Again, it's easier to see what has happened by looking at a graph of the revision history, in <xref linkend="fig:undo:backout-manual"/>. This makes it clear that when we use <command role="hg-cmd">hg backout</command> to back out a change other than the tip, Mercurial adds a new head to the repository (the change it committed is box-shaped).</para> <figure id="fig:undo:backout-manual"> <title>Backing out a change using the <command role="hg-cmd">hg backout</command> command</title> <mediaobject> <imageobject><imagedata fileref="figs/undo-manual.png"/></imageobject> <textobject><phrase>XXX add text</phrase></textobject> </mediaobject> </figure> <para id="x_10a">After the <command role="hg-cmd">hg backout</command> command has completed, it leaves the new <quote>backout</quote> changeset as the parent of the working directory.</para> &interaction.backout.manual.parents; <para id="x_10b">Now we have two isolated sets of changes.</para> &interaction.backout.manual.heads; <para id="x_10c">Let's think about what we expect to see as the contents of <filename>myfile</filename> now. The first change should be present, because we've never backed it out. The second change should be missing, as that's the change we backed out. 
Since the history graph shows the third change as a separate head, we <emphasis>don't</emphasis> expect to see the third change present in <filename>myfile</filename>.</para> &interaction.backout.manual.cat; <para id="x_10d">To get the third change back into the file, we just do a normal merge of our two heads.</para> &interaction.backout.manual.merge; <para id="x_10e">Afterwards, the graphical history of our repository looks like <xref linkend="fig:undo:backout-manual-merge"/>.</para> <figure id="fig:undo:backout-manual-merge"> <title>Manually merging a backout change</title> <mediaobject> <imageobject><imagedata fileref="figs/undo-manual-merge.png"/></imageobject> <textobject><phrase>XXX add text</phrase></textobject> </mediaobject> </figure> </sect2> <sect2> <title>Why <command role="hg-cmd">hg backout</command> works as it does</title> <para id="x_110">Here's a brief description of how the <command role="hg-cmd">hg backout</command> command works.</para> <orderedlist> <listitem><para id="x_111">It ensures that the working directory is <quote>clean</quote>, i.e. that the output of <command role="hg-cmd">hg status</command> would be empty.</para> </listitem> <listitem><para id="x_112">It remembers the current parent of the working directory. Let's call this changeset <literal>orig</literal>.</para> </listitem> <listitem><para id="x_113">It does the equivalent of a <command role="hg-cmd">hg update</command> to sync the working directory to the changeset you want to back out. Let's call this changeset <literal>backout</literal>.</para> </listitem> <listitem><para id="x_114">It finds the parent of that changeset. 
Let's call that changeset <literal>parent</literal>.</para> </listitem> <listitem><para id="x_115">For each file that the <literal>backout</literal> changeset affected, it does the equivalent of a <command role="hg-cmd">hg revert -r parent</command> on that file, to restore it to the contents it had before that changeset was committed.</para> </listitem> <listitem><para id="x_116">It commits the result as a new changeset. This changeset has <literal>backout</literal> as its parent.</para> </listitem> <listitem><para id="x_117">If you specify <option role="hg-opt-backout">--merge</option> on the command line, it merges with <literal>orig</literal>, and commits the result of the merge.</para> </listitem></orderedlist> <para id="x_118">An alternative way to implement the <command role="hg-cmd">hg backout</command> command would be to <command role="hg-cmd">hg export</command> the to-be-backed-out changeset as a diff, then use the <option role="cmd-opt-patch">--reverse</option> option to the <command>patch</command> command to reverse the effect of the change without fiddling with the working directory. 
This sounds much simpler, but it would not work nearly as well.</para> <para id="x_119">The reason that <command role="hg-cmd">hg backout</command> does an update, a commit, a merge, and another commit is to give the merge machinery the best chance to do a good job when dealing with all the changes <emphasis>between</emphasis> the change you're backing out and the current tip.</para> <para id="x_11a">If you're backing out a changeset that's 100 revisions back in your project's history, the chances that the <command>patch</command> command will be able to apply a reverse diff cleanly are not good, because intervening changes are likely to have <quote>broken the context</quote> that <command>patch</command> uses to determine whether it can apply a patch (if this sounds like gibberish, see <xref linkend="sec:mq:patch"/> for a discussion of the <command>patch</command> command). Also, Mercurial's merge machinery will handle files and directories being renamed, permission changes, and modifications to binary files, none of which <command>patch</command> can deal with.</para> </sect2> </sect1> <sect1 id="sec:undo:aaaiiieee"> <title>Changes that should never have been</title> <para id="x_11b">Most of the time, the <command role="hg-cmd">hg backout</command> command is exactly what you need if you want to undo the effects of a change. It leaves a permanent record of exactly what you did, both when committing the original changeset and when you cleaned up after it.</para> <para id="x_11c">On rare occasions, though, you may find that you've committed a change that really should not be present in the repository at all. For example, it would be very unusual, and usually considered a mistake, to commit a software project's object files as well as its source files. 
Object files have almost no intrinsic value, and they're <emphasis>big</emphasis>, so they increase the size of the repository and the amount of time it takes to clone or pull changes.</para> <para id="x_11d">Before I discuss the options that you have if you commit a <quote>brown paper bag</quote> change (the kind that's so bad that you want to pull a brown paper bag over your head), let me first discuss some approaches that probably won't work.</para> <para id="x_11e">Since Mercurial treats history as cumulative&emdash;every change builds on top of all changes that preceded it&emdash;you generally can't just make disastrous changes disappear. The one exception is when you've just committed a change, and it hasn't been pushed or pulled into another repository. That's when you can safely use the <command role="hg-cmd">hg rollback</command> command, as I detailed in <xref linkend="sec:undo:rollback"/>.</para> <para id="x_11f">After you've pushed a bad change to another repository, you <emphasis>could</emphasis> still use <command role="hg-cmd">hg rollback</command> to make your local copy of the change disappear, but it won't have the consequences you want. The change will still be present in the remote repository, so it will reappear in your local repository the next time you pull.</para> <para id="x_120">If a situation like this arises, and you know which repositories your bad change has propagated into, you can <emphasis>try</emphasis> to get rid of the change from <emphasis>every</emphasis> one of those repositories. This is, of course, not a satisfactory solution: if you miss even a single repository while you're expunging, the change is still <quote>in the wild</quote>, and could propagate further.</para> <para id="x_121">If you've committed one or more changes <emphasis>after</emphasis> the change that you'd like to see disappear, your options are further reduced. 
Mercurial doesn't provide a way to <quote>punch a hole</quote> in history, leaving the other changesets intact.</para> <para id="x_122">XXX This needs filling out. The <literal>hg-replay</literal> script in the <literal>examples</literal> directory works, but doesn't handle merge changesets. Kind of an important omission.</para> <sect2> <title>Protect yourself from <quote>escaped</quote> changes</title> <para id="x_123">If you've committed some changes to your local repository and they've been pushed or pulled somewhere else, this isn't necessarily a disaster. You can protect yourself ahead of time against some classes of bad changeset. This is particularly easy if your team usually pulls changes from a central repository.</para> <para id="x_124">By configuring some hooks on that repository to validate incoming changesets (see chapter <xref linkend="chap:hook"/>), you can automatically prevent some kinds of bad changeset from being pushed to the central repository at all. With such a configuration in place, some kinds of bad changeset will naturally tend to <quote>die out</quote> because they can't propagate into the central repository. Better yet, this happens without any need for explicit intervention.</para> <para id="x_125">For instance, an incoming change hook that verifies that a changeset will actually compile can prevent people from inadvertently <quote>breaking the build</quote>.</para> </sect2> </sect1> <sect1 id="sec:undo:bisect"> <title>Finding the source of a bug</title> <para id="x_126">While it's all very well to be able to back out a changeset that introduced a bug, this requires that you know which changeset to back out. 
Mercurial provides an invaluable command, called <command role="hg-cmd">hg bisect</command>, that helps you to automate this process and accomplish it very efficiently.</para> <para id="x_127">The idea behind the <command role="hg-cmd">hg bisect</command> command is that a changeset has introduced some change of behavior that you can identify with a simple binary test. You don't know which piece of code introduced the change, but you know how to test for the presence of the bug. The <command role="hg-cmd">hg bisect</command> command uses your test to direct its search for the changeset that introduced the code that caused the bug.</para> <para id="x_128">Here are a few scenarios to help you understand how you might apply this command.</para> <itemizedlist> <listitem><para id="x_129">The most recent version of your software has a bug that you remember wasn't present a few weeks ago, but you don't know when it was introduced. Here, your binary test checks for the presence of that bug.</para> </listitem> <listitem><para id="x_12a">You fixed a bug in a rush, and now it's time to close the entry in your team's bug database. The bug database requires a changeset ID when you close an entry, but you don't remember which changeset you fixed the bug in. Once again, your binary test checks for the presence of the bug.</para> </listitem> <listitem><para id="x_12b">Your software works correctly, but runs 15% slower than the last time you measured it. You want to know which changeset introduced the performance regression. 
In this case, your binary test measures the performance of your software, to see whether it's <quote>fast</quote> or <quote>slow</quote>.</para> </listitem> <listitem><para id="x_12c">The sizes of the components of your project that you ship exploded recently, and you suspect that something changed in the way you build your project.</para> </listitem></itemizedlist> <para id="x_12d">From these examples, it should be clear that the <command role="hg-cmd">hg bisect</command> command is not useful only for finding the sources of bugs. You can use it to find any <quote>emergent property</quote> of a repository (anything that you can't find from a simple text search of the files in the tree) for which you can write a binary test.</para> <para id="x_12e">We'll introduce a little bit of terminology here, just to make it clear which parts of the search process are your responsibility, and which are Mercurial's. A <emphasis>test</emphasis> is something that <emphasis>you</emphasis> run when <command role="hg-cmd">hg bisect</command> chooses a changeset. A <emphasis>probe</emphasis> is what <command role="hg-cmd">hg bisect</command> runs to tell whether a revision is good. Finally, we'll use the word <quote>bisect</quote>, as both a noun and a verb, to stand in for the phrase <quote>search using the <command role="hg-cmd">hg bisect</command> command</quote>.</para> <para id="x_12f">One simple way to automate the searching process would be simply to probe every changeset. However, this scales poorly. If it took ten minutes to test a single changeset, and you had 10,000 changesets in your repository, the exhaustive approach would take on average 35 <emphasis>days</emphasis> to find the changeset that introduced a bug. 
Even if you knew that the bug was introduced by one of the last 500 changesets, and limited your search to those, you'd still be looking at over 40 hours to find the changeset that introduced your bug.</para> <para id="x_130">What the <command role="hg-cmd">hg bisect</command> command does is use its knowledge of the <quote>shape</quote> of your project's revision history to perform a search in time proportional to the <emphasis>logarithm</emphasis> of the number of changesets to check (the kind of search it performs is called a dichotomic search). With this approach, searching through 10,000 changesets will take less than three hours, even at ten minutes per test (the search will require about 14 tests). Limit your search to the last hundred changesets, and it will take only about an hour (roughly seven tests).</para> <para id="x_131">The <command role="hg-cmd">hg bisect</command> command is aware of the <quote>branchy</quote> nature of a Mercurial project's revision history, so it has no problems dealing with branches, merges, or multiple heads in a repository. It can prune entire branches of history with a single probe, which is how it operates so efficiently.</para> <sect2> <title>Using the <command role="hg-cmd">hg bisect</command> command</title> <para id="x_132">Here's an example of <command role="hg-cmd">hg bisect</command> in action.</para> <note> <para id="x_133"> In versions 0.9.5 and earlier of Mercurial, <command role="hg-cmd">hg bisect</command> was not a core command: it was distributed with Mercurial as an extension. 
This section describes the built-in command, not the old extension.</para> </note> <para id="x_134">Now let's create a repository, so that we can try out the <command role="hg-cmd">hg bisect</command> command in isolation.</para> &interaction.bisect.init; <para id="x_135">We'll simulate a project that has a bug in it in a simple-minded way: create trivial changes in a loop, and nominate one specific change that will have the <quote>bug</quote>. This loop creates 35 changesets, each adding a single file to the repository. We'll represent our <quote>bug</quote> with a file that contains the text <quote>i have a gub</quote>.</para> &interaction.bisect.commits; <para id="x_136">The next thing that we'd like to do is figure out how to use the <command role="hg-cmd">hg bisect</command> command. We can use Mercurial's normal built-in help mechanism for this.</para> &interaction.bisect.help; <para id="x_137">The <command role="hg-cmd">hg bisect</command> command works in steps. Each step proceeds as follows.</para> <orderedlist> <listitem><para id="x_138">You run your binary test.</para> <itemizedlist> <listitem><para id="x_139">If the test succeeded, you tell <command role="hg-cmd">hg bisect</command> by running the <command role="hg-cmd">hg bisect --good</command> command.</para> </listitem> <listitem><para id="x_13a">If it failed, run the <command role="hg-cmd">hg bisect --bad</command> command.</para></listitem></itemizedlist> </listitem> <listitem><para id="x_13b">The command uses your information to decide which changeset to test next.</para> </listitem> <listitem><para id="x_13c">It updates the working directory to that changeset, and the process begins again.</para> </listitem></orderedlist> <para id="x_13d">The process ends when <command role="hg-cmd">hg bisect</command> identifies a unique changeset that marks the point where your test transitioned from <quote>succeeding</quote> to <quote>failing</quote>.</para> <para id="x_13e">To start the search, we must run the 
<command role="hg-cmd">hg bisect --reset</command> command.</para> &interaction.bisect.search.init; <para id="x_13f">In our case, the binary test we use is simple: we check to see if any file in the repository contains the string <quote>i have a gub</quote>. If it does, this changeset contains the change that <quote>caused the bug</quote>. By convention, a changeset that has the property we're searching for is <quote>bad</quote>, while one that doesn't is <quote>good</quote>.</para> <para id="x_140">Most of the time, the revision to which the working directory is synced (usually the tip) already exhibits the problem introduced by the buggy change, so we'll mark it as <quote>bad</quote>.</para> &interaction.bisect.search.bad-init; <para id="x_141">Our next task is to nominate a changeset that we know <emphasis>doesn't</emphasis> have the bug; the <command role="hg-cmd">hg bisect</command> command will <quote>bracket</quote> its search between the first pair of good and bad changesets. In our case, we know that revision 10 didn't have the bug. (I'll have more words about choosing the first <quote>good</quote> changeset later.)</para> &interaction.bisect.search.good-init; <para id="x_142">Notice that this command printed some output.</para> <itemizedlist> <listitem><para id="x_143">It told us how many changesets it must consider before it can identify the one that introduced the bug, and how many tests that will require.</para> </listitem> <listitem><para id="x_144">It updated the working directory to the next changeset to test, and told us which changeset it's testing.</para> </listitem></itemizedlist> <para id="x_145">We now run our test in the working directory. We use the <command>grep</command> command to see if our <quote>bad</quote> file is present in the working directory. If it is, this revision is bad; if not, this revision is good. 
&interaction.bisect.search.step1;</para> <para id="x_146">This test looks like a perfect candidate for automation, so let's turn it into a shell function.</para> &interaction.bisect.search.mytest; <para id="x_147">We can now run an entire test step with a single command, <literal>mytest</literal>.</para> &interaction.bisect.search.step2; <para id="x_148">A few more invocations of our canned test step command, and we're done.</para> &interaction.bisect.search.rest; <para id="x_149">Even though we had 35 changesets to search through, the <command role="hg-cmd">hg bisect</command> command let us find the changeset that introduced our <quote>bug</quote> with only five tests. Because the number of tests that the <command role="hg-cmd">hg bisect</command> command performs grows logarithmically with the number of changesets to search, the advantage that it has over the <quote>brute force</quote> search approach increases with every changeset you add.</para> </sect2> <sect2> <title>Cleaning up after your search</title> <para id="x_14a">When you're finished using the <command role="hg-cmd">hg bisect</command> command in a repository, you can use the <command role="hg-cmd">hg bisect --reset</command> command to drop the information it was using to drive your search. This information doesn't take up much space, so it doesn't matter if you forget to run this command. However, <command role="hg-cmd">hg bisect</command> won't let you start a new search in that repository until you do a <command role="hg-cmd">hg bisect --reset</command>.</para> &interaction.bisect.search.reset; </sect2> </sect1> <sect1> <title>Tips for finding bugs effectively</title> <sect2> <title>Give consistent input</title> <para id="x_14b">The <command role="hg-cmd">hg bisect</command> command requires that you correctly report the result of every test you perform. If you tell it that a test failed when it really succeeded, it <emphasis>might</emphasis> be able to detect the inconsistency. 
If it can identify an inconsistency in your reports, it will tell you that a particular changeset is both good and bad. However, it can't do this perfectly; it's about as likely to report the wrong changeset as the source of the bug.</para> </sect2> <sect2> <title>Automate as much as possible</title> <para id="x_14c">When I started using the <command role="hg-cmd">hg bisect</command> command, I tried a few times to run my tests by hand, on the command line. This is an approach that I, at least, am not suited to. After a few tries, I found that I was making enough mistakes that I was having to restart my searches several times before finally getting correct results.</para> <para id="x_14d">My initial problems with driving the <command role="hg-cmd">hg bisect</command> command by hand occurred even with simple searches on small repositories; if the problem you're looking for is more subtle, or the number of tests that <command role="hg-cmd">hg bisect</command> must perform increases, the likelihood of operator error ruining the search is much higher. Once I started automating my tests, I had much better results.</para> <para id="x_14e">The key to automated testing is twofold:</para> <itemizedlist> <listitem><para id="x_14f">always test for the same symptom, and</para> </listitem> <listitem><para id="x_150">always feed consistent input to the <command role="hg-cmd">hg bisect</command> command.</para> </listitem></itemizedlist> <para id="x_151">In my tutorial example above, the <command>grep</command> command tests for the symptom, and the <literal>if</literal> statement takes the result of this check and ensures that we always feed the same input to the <command role="hg-cmd">hg bisect</command> command. 
The <literal>mytest</literal> function marries these together in a reproducible way, so that every test is uniform and consistent.</para> </sect2> <sect2> <title>Check your results</title> <para id="x_152">Because the output of a <command role="hg-cmd">hg bisect</command> search is only as good as the input you give it, don't take the changeset it reports as the absolute truth. A simple way to cross-check its report is to manually run your test at each of the following changesets:</para> <itemizedlist> <listitem><para id="x_153">The changeset that it reports as the first bad revision. Your test should still report this as bad.</para> </listitem> <listitem><para id="x_154">The parent of that changeset (either parent, if it's a merge). Your test should report this changeset as good.</para> </listitem> <listitem><para id="x_155">A child of that changeset. Your test should report this changeset as bad.</para> </listitem></itemizedlist> </sect2> <sect2> <title>Beware interference between bugs</title> <para id="x_156">It's possible that your search for one bug could be disrupted by the presence of another. For example, let's say your software crashes at revision 100, and worked correctly at revision 50. Unknown to you, someone else introduced a different crashing bug at revision 60, and fixed it at revision 80. This could distort your results in one of several ways.</para> <para id="x_157">It is possible that this other bug completely <quote>masks</quote> yours, which is to say that it occurs before your bug has a chance to manifest itself. If you can't avoid that other bug (for example, it prevents your project from building), and so can't tell whether your bug is present in a particular changeset, the <command role="hg-cmd">hg bisect</command> command cannot help you directly. 
Instead, you can mark a changeset as untested by running <command role="hg-cmd">hg bisect --skip</command>.</para> <para id="x_158">A different problem could arise if your test for a bug's presence is not specific enough. If you check for <quote>my program crashes</quote>, then both your crashing bug and an unrelated crashing bug that masks it will look like the same thing, and mislead <command role="hg-cmd">hg bisect</command>.</para> <para id="x_159">Another useful situation in which to use <command role="hg-cmd">hg bisect --skip</command> is if you can't test a revision because your project was in a broken and hence untestable state at that revision, perhaps because someone checked in a change that prevented the project from building.</para> </sect2> <sect2> <title>Bracket your search lazily</title> <para id="x_15a">Choosing the first <quote>good</quote> and <quote>bad</quote> changesets that will mark the end points of your search is often easy, but it bears a little discussion nevertheless. From the perspective of <command role="hg-cmd">hg bisect</command>, the <quote>newest</quote> changeset is conventionally <quote>bad</quote>, and the older changeset is <quote>good</quote>.</para> <para id="x_15b">If you're having trouble remembering when a suitable <quote>good</quote> change was, so that you can tell <command role="hg-cmd">hg bisect</command>, you could do worse than testing changesets at random. 
Just remember to eliminate contenders that can't possibly exhibit the bug (perhaps because the feature with the bug isn't present yet) and those where another problem masks the bug (as I discussed above).</para> <para id="x_15c">Even if you end up <quote>early</quote> by thousands of changesets or months of history, you will only add a handful of tests to the total number that <command role="hg-cmd">hg bisect</command> must perform, thanks to its logarithmic behavior.</para> </sect2> </sect1> </chapter> <!-- local variables: sgml-parent-document: ("00book.xml" "book" "chapter") end: -->