comparison en/ch08-undo.xml @ 682:28b5a5befb08

Fold preface and intro into one
author Bryan O'Sullivan <bos@serpentine.com>
date Thu, 19 Mar 2009 20:54:12 -0700
parents en/ch09-undo.xml@13513d2a128d
children c838b3975bc6
681:5bfa0df6aaed 682:28b5a5befb08
1 <!-- vim: set filetype=docbkxml shiftwidth=2 autoindent expandtab tw=77 : -->
2
3 <chapter id="chap:undo">
4 <?dbhtml filename="finding-and-fixing-mistakes.html"?>
5 <title>Finding and fixing mistakes</title>
6
7 <para>To err might be human, but to really handle the consequences
8 well takes a top-notch revision control system. In this chapter,
9 we'll discuss some of the techniques you can use when you find
10 that a problem has crept into your project. Mercurial has some
11 highly capable features that will help you to isolate the sources
12 of problems, and to handle them appropriately.</para>
13
14 <sect1>
15 <title>Erasing local history</title>
16
17 <sect2>
18 <title>The accidental commit</title>
19
20 <para>I have the occasional but persistent problem of typing
21 rather more quickly than I can think, which sometimes results
22 in me committing a changeset that is either incomplete or
23 plain wrong. In my case, the usual kind of incomplete
24 changeset is one in which I've created a new source file, but
25 forgotten to <command role="hg-cmd">hg add</command> it. A
26 <quote>plain wrong</quote> changeset is not as common, but no
27 less annoying.</para>
28
29 </sect2>
30 <sect2 id="sec:undo:rollback">
31 <title>Rolling back a transaction</title>
32
33 <para>In section <xref linkend="sec:concepts:txn"/>, I mentioned
34 that Mercurial treats each modification of a repository as a
35 <emphasis>transaction</emphasis>. Every time you commit a
36 changeset or pull changes from another repository, Mercurial
37 remembers what you did. You can undo, or <emphasis>roll
38 back</emphasis>, exactly one of these actions using the
39 <command role="hg-cmd">hg rollback</command> command. (See
40 section <xref linkend="sec:undo:rollback-after-push"/> for an
41 important caveat about the use of this command.)</para>
42
43 <para>Here's a mistake that I often find myself making:
44 committing a change in which I've created a new file, but
45 forgotten to <command role="hg-cmd">hg add</command>
46 it.</para>
47
48 &interaction.rollback.commit;
49
50 <para>Looking at the output of <command role="hg-cmd">hg
51 status</command> after the commit immediately confirms the
52 error.</para>
53
54 &interaction.rollback.status;
55
56 <para>The commit captured the changes to the file
57 <filename>a</filename>, but not the new file
58 <filename>b</filename>. If I were to push this changeset to a
59 repository that I shared with a colleague, the chances are
60 high that something in <filename>a</filename> would refer to
61 <filename>b</filename>, which would not be present in their
62 repository when they pulled my changes. I would thus become
63 the object of some indignation.</para>
64
65 <para>However, luck is with me&emdash;I've caught my error
66 before I pushed the changeset. I use the <command
67 role="hg-cmd">hg rollback</command> command, and Mercurial
68 makes that last changeset vanish.</para>
69
70 &interaction.rollback.rollback;
71
72 <para>Notice that the changeset is no longer present in the
73 repository's history, and the working directory once again
74 thinks that the file <filename>a</filename> is modified. The
75 commit and rollback have left the working directory exactly as
76 it was prior to the commit; the changeset has been completely
77 erased. I can now safely <command role="hg-cmd">hg
78 add</command> the file <filename>b</filename>, and rerun my
79 commit.</para>
80
81 &interaction.rollback.add;
82
83 </sect2>
84 <sect2>
85 <title>The erroneous pull</title>
86
87 <para>It's common practice with Mercurial to maintain separate
88 development branches of a project in different repositories.
89 Your development team might have one shared repository for
90 your project's <quote>0.9</quote> release, and another,
91 containing different changes, for the <quote>1.0</quote>
92 release.</para>
93
94 <para>Given this, you can imagine that the consequences could be
95 messy if you had a local <quote>0.9</quote> repository, and
96 accidentally pulled changes from the shared <quote>1.0</quote>
97 repository into it. At worst, you could be paying
98 insufficient attention, and push those changes into the shared
99 <quote>0.9</quote> tree, confusing your entire team (but don't
100 worry, we'll return to this horror scenario later). However,
101 it's more likely that you'll notice immediately, because
102 Mercurial will display the URL it's pulling from, or you will
103 see it pull a suspiciously large number of changes into the
104 repository.</para>
105
106 <para>The <command role="hg-cmd">hg rollback</command> command
107 will work nicely to expunge all of the changesets that you
108 just pulled. Mercurial groups all changes from one <command
109 role="hg-cmd">hg pull</command> into a single transaction,
110 so one <command role="hg-cmd">hg rollback</command> is all you
111 need to undo this mistake.</para>
112
113 </sect2>
114 <sect2 id="sec:undo:rollback-after-push">
115 <title>Rolling back is useless once you've pushed</title>
116
117 <para>The value of the <command role="hg-cmd">hg
118 rollback</command> command drops to zero once you've pushed
119 your changes to another repository. Rolling back a change
120 makes it disappear entirely, but <emphasis>only</emphasis> in
121 the repository in which you perform the <command
122 role="hg-cmd">hg rollback</command>. Because a rollback
123 eliminates history, there's no way for the disappearance of a
124 change to propagate between repositories.</para>
125
126 <para>If you've pushed a change to another
127 repository&emdash;particularly if it's a shared
128 repository&emdash;it has essentially <quote>escaped into the
129 wild,</quote> and you'll have to recover from your mistake
130 in a different way. What will happen if you push a changeset
131 somewhere, then roll it back, then pull from the repository
132 you pushed to, is that the changeset will reappear in your
133 repository.</para>
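<para>The reappearance is easy to reproduce in a toy model. The sketch below is not Mercurial code: it treats each repository as a plain Python list of changeset names, and the helpers <literal>push</literal>, <literal>pull</literal> and <literal>rollback</literal> are hypothetical stand-ins for the real commands.</para>

```python
def push(src, dst):
    """Copy every changeset that dst lacks from src to dst."""
    for cs in src:
        if cs not in dst:
            dst.append(cs)


def pull(dst, src):
    """Pulling is the same transfer, seen from the receiving side."""
    push(src, dst)


def rollback(repo):
    """Forget the most recent changeset, in this repository only."""
    return repo.pop()


local = ["good change", "bad change"]
shared = ["good change"]

push(local, shared)            # the bad change escapes
rollback(local)                # ...and vanishes locally
after_rollback = list(local)
pull(local, shared)            # ...but the next pull brings it back

print(after_rollback)
print(local)
```

<para>Nothing in the model lets the rollback reach the shared list, which is the point of this section.</para>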
134
135 <para>(If you absolutely know for sure that the change you want
136 to roll back is the most recent change in the repository that
137 you pushed to, <emphasis>and</emphasis> you know that nobody
138 else could have pulled it from that repository, you can roll
139 back the changeset there, too, but you really should
140 not rely on this working reliably. If you do this, sooner or
141 later a change really will make it into a repository that you
142 don't directly control (or have forgotten about), and come
143 back to bite you.)</para>
144
145 </sect2>
146 <sect2>
147 <title>You can only roll back once</title>
148
149 <para>Mercurial stores exactly one transaction in its
150 transaction log; that transaction is the most recent one that
151 occurred in the repository. This means that you can only roll
152 back one transaction. If you expect to be able to roll back
153 one transaction, then its predecessor, this is not the
154 behaviour you will get.</para>
155
156 &interaction.rollback.twice;
157
158 <para>Once you've rolled back one transaction in a repository,
159 you can't roll back again in that repository until you perform
160 another commit or pull.</para>
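<para>One way to picture this is as an undo record with a single slot. The following is a minimal sketch under that assumption, not Mercurial's implementation; the class <literal>ToyRepo</literal> and its attributes are invented for illustration.</para>

```python
class ToyRepo:
    """Toy model: a history plus a one-slot undo record."""

    def __init__(self):
        self.history = []
        self.undo_count = None      # size of the last transaction, if any

    def commit(self, changeset):
        self.history.append(changeset)
        self.undo_count = 1

    def pull(self, changesets):
        # A whole pull is recorded as a single transaction.
        self.history.extend(changesets)
        self.undo_count = len(changesets)

    def rollback(self):
        if self.undo_count is None:
            return False            # nothing recorded, nothing to undo
        del self.history[len(self.history) - self.undo_count:]
        self.undo_count = None      # the single slot is now empty
        return True


repo = ToyRepo()
repo.commit("a")
repo.pull(["b", "c"])
first = repo.rollback()             # undoes the whole pull
second = repo.rollback()            # no transaction left to undo
print(first, second, repo.history)
```

<para>In this model a second rollback only becomes possible again after another commit or pull refills the slot.</para>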
161
162 </sect2>
163 </sect1>
164 <sect1>
165 <title>Reverting the mistaken change</title>
166
167 <para>If you make a modification to a file, and decide that you
168 really didn't want to change the file at all, and you haven't
169 yet committed your changes, the <command role="hg-cmd">hg
170 revert</command> command is the one you'll need. It looks at
171 the changeset that's the parent of the working directory, and
172 restores the contents of the file to their state as of that
173 changeset. (That's a long-winded way of saying that, in the
174 normal case, it undoes your modifications.)</para>
175
176 <para>Let's illustrate how the <command role="hg-cmd">hg
177 revert</command> command works with yet another small example.
178 We'll begin by modifying a file that Mercurial is already
179 tracking.</para>
180
181 &interaction.daily.revert.modify;
182
183 <para>If we don't
184 want that change, we can simply <command role="hg-cmd">hg
185 revert</command> the file.</para>
186
187 &interaction.daily.revert.unmodify;
188
189 <para>The <command role="hg-cmd">hg revert</command> command
190 provides us with an extra degree of safety by saving our
191 modified file with a <filename>.orig</filename>
192 extension.</para>
193
194 &interaction.daily.revert.status;
195
196 <para>Here is a summary of the cases that the <command
197 role="hg-cmd">hg revert</command> command can deal with. We
198 will describe each of these in more detail in the section that
199 follows.</para>
200 <itemizedlist>
201 <listitem><para>If you modify a file, it will restore the file
202 to its unmodified state.</para>
203 </listitem>
204 <listitem><para>If you <command role="hg-cmd">hg add</command> a
205 file, it will undo the <quote>added</quote> state of the
206 file, but leave the file itself untouched.</para>
207 </listitem>
208 <listitem><para>If you delete a file without telling Mercurial,
209 it will restore the file to its unmodified contents.</para>
210 </listitem>
211 <listitem><para>If you use the <command role="hg-cmd">hg
212 remove</command> command to remove a file, it will undo
213 the <quote>removed</quote> state of the file, and restore
214 the file to its unmodified contents.</para>
215 </listitem></itemizedlist>
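<para>The four cases reduce to two rules, which the toy model below tries to capture. It is a sketch, not Mercurial code: the function <literal>revert</literal> and its arguments are hypothetical, file contents are plain strings, and the <filename>.orig</filename> backup is left out.</para>

```python
def revert(name, parent, workdir, added, removed):
    """Toy model of reverting one file.

    parent  -- file contents in the parent changeset (dict)
    workdir -- file contents on disk (dict, modified in place)
    added   -- names scheduled for addition (set)
    removed -- names scheduled for removal (set)
    """
    if name in added:
        # Undo the "added" state; the file on disk is left alone.
        added.discard(name)
    else:
        # Modified, missing or removed: undo any "removed" state
        # and restore the parent's contents.
        removed.discard(name)
        workdir[name] = parent[name]


parent = {"a": "one", "b": "two", "c": "three"}
# "a" was edited, "new" was added, "b" was deleted by hand, "c" was removed.
workdir = {"a": "one, edited", "new": "fresh"}
added = {"new"}
removed = {"c"}

for name in ["a", "new", "b", "c"]:
    revert(name, parent, workdir, added, removed)

print(sorted(workdir.items()))
```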
216
217 <sect2 id="sec:undo:mgmt">
218 <title>File management errors</title>
219
220 <para>The <command role="hg-cmd">hg revert</command> command is
221 useful for more than just modified files. It lets you reverse
222 the results of all of Mercurial's file management
223 commands&emdash;<command role="hg-cmd">hg add</command>,
224 <command role="hg-cmd">hg remove</command>, and so on.</para>
225
226 <para>If you <command role="hg-cmd">hg add</command> a file,
227 then decide that in fact you don't want Mercurial to track it,
228 use <command role="hg-cmd">hg revert</command> to undo the
229 add. Don't worry; Mercurial will not modify the file in any
230 way. It will just <quote>unmark</quote> the file.</para>
231
232 &interaction.daily.revert.add;
233
234 <para>Similarly, if you ask Mercurial to <command
235 role="hg-cmd">hg remove</command> a file, you can use
236 <command role="hg-cmd">hg revert</command> to restore it to
237 the contents it had as of the parent of the working directory.
238 &interaction.daily.revert.remove; This works just as
239 well for a file that you deleted by hand, without telling
240 Mercurial (recall that in Mercurial terminology, this kind of
241 file is called <quote>missing</quote>).</para>
242
243 &interaction.daily.revert.missing;
244
245 <para>If you revert a <command role="hg-cmd">hg copy</command>,
246 the copied-to file remains in your working directory
247 afterwards, untracked. Since a copy doesn't affect the
248 copied-from file in any way, Mercurial doesn't do anything
249 with the copied-from file.</para>
250
251 &interaction.daily.revert.copy;
252
253 <sect3>
254 <title>A slightly special case: reverting a rename</title>
255
256 <para>If you <command role="hg-cmd">hg rename</command> a
257 file, there is one small detail that you should remember.
258 When you <command role="hg-cmd">hg revert</command> a
259 rename, it's not enough to provide the name of the
260 renamed-to file, as you can see here.</para>
261
262 &interaction.daily.revert.rename;
263
264 <para>As you can see from the output of <command
265 role="hg-cmd">hg status</command>, the renamed-to file is
266 no longer identified as added, but the
267 renamed-<emphasis>from</emphasis> file is still removed!
268 This is counter-intuitive (at least to me), but at least
269 it's easy to deal with.</para>
270
271 &interaction.daily.revert.rename-orig;
272
273 <para>So remember, to revert a <command role="hg-cmd">hg
274 rename</command>, you must provide
275 <emphasis>both</emphasis> the source and destination
276 names.</para>
277
278 <!-- TODO: the output doesn't look like it will be removed! -->
280
281 <para>(By the way, if you rename a file, then modify the
282 renamed-to file, then revert both components of the rename,
283 when Mercurial restores the file that was removed as part of
284 the rename, it will be unmodified. If you need the
285 modifications in the renamed-to file to show up in the
286 renamed-from file, don't forget to copy them over.)</para>
287
288 <para>These fiddly aspects of reverting a rename arguably
289 constitute a small bug in Mercurial.</para>
290
291 </sect3>
292 </sect2>
293 </sect1>
294 <sect1>
295 <title>Dealing with committed changes</title>
296
297 <para>Consider a case where you have committed a change
298 <emphasis>a</emphasis>, and another change <emphasis>b</emphasis> on
299 top of it; you then realise that change <emphasis>a</emphasis> was
300 incorrect. Mercurial lets you <quote>back out</quote> an entire
301 changeset automatically, and provides building blocks that let you reverse part of a changeset by hand.</para>
302
303 <para>Before you read this section, here's something to keep in
304 mind: the <command role="hg-cmd">hg backout</command> command
305 undoes changes by <emphasis>adding</emphasis> history, not by
306 modifying or erasing it. It's the right tool to use if you're
307 fixing bugs, but not if you're trying to undo some change that
308 has catastrophic consequences. To deal with those, see section
309 <xref linkend="sec:undo:aaaiiieee"/>.</para>
310
311 <sect2>
312 <title>Backing out a changeset</title>
313
314 <para>The <command role="hg-cmd">hg backout</command> command
315 lets you <quote>undo</quote> the effects of an entire
316 changeset in an automated fashion. Because Mercurial's
317 history is immutable, this command <emphasis>does
318 not</emphasis> get rid of the changeset you want to undo.
319 Instead, it creates a new changeset that
320 <emphasis>reverses</emphasis> the effect of the to-be-undone
321 changeset.</para>
322
323 <para>The operation of the <command role="hg-cmd">hg
324 backout</command> command is a little intricate, so let's
325 illustrate it with some examples. First, we'll create a
326 repository with some simple changes.</para>
327
328 &interaction.backout.init;
329
330 <para>The <command role="hg-cmd">hg backout</command> command
331 takes a single changeset ID as its argument; this is the
332 changeset to back out. Normally, <command role="hg-cmd">hg
333 backout</command> will drop you into a text editor to write
334 a commit message, so you can record why you're backing the
335 change out. In this example, we provide a commit message on
336 the command line using the <option
337 role="hg-opt-backout">-m</option> option.</para>
338
339 </sect2>
340 <sect2>
341 <title>Backing out the tip changeset</title>
342
343 <para>We're going to start by backing out the last changeset we
344 committed.</para>
345
346 &interaction.backout.simple;
347
348 <para>You can see that the second line from
349 <filename>myfile</filename> is no longer present. Taking a
350 look at the output of <command role="hg-cmd">hg log</command>
351 gives us an idea of what the <command role="hg-cmd">hg
352 backout</command> command has done.
353 &interaction.backout.simple.log; Notice that the new changeset
354 that <command role="hg-cmd">hg backout</command> has created
355 is a child of the changeset we backed out. It's easier to see
356 this in figure <xref
357 linkend="fig:undo:backout"/>, which presents a graphical
358 view of the change history. As you can see, the history is
359 nice and linear.</para>
360
361 <informalfigure id="fig:undo:backout">
362 <mediaobject><imageobject><imagedata
363 fileref="undo-simple"/></imageobject><textobject><phrase>XXX
364 add text</phrase></textobject><caption><para>Backing out
365 a change using the <command role="hg-cmd">hg
366 backout</command>
367 command</para></caption></mediaobject>
368
369 </informalfigure>
370
371 </sect2>
372 <sect2>
373 <title>Backing out a non-tip change</title>
374
375 <para>If you want to back out a change other than the last one
376 you committed, pass the <option
377 role="hg-opt-backout">--merge</option> option to the
378 <command role="hg-cmd">hg backout</command> command.</para>
379
380 &interaction.backout.non-tip.clone;
381
382 <para>This makes backing out any changeset a
383 <quote>one-shot</quote> operation that's usually simple and
384 fast.</para>
385
386 &interaction.backout.non-tip.backout;
387
388 <para>If you take a look at the contents of
389 <filename>myfile</filename> after the backout finishes, you'll
390 see that the first and third changes are present, but not the
391 second.</para>
392
393 &interaction.backout.non-tip.cat;
394
395 <para>As the graphical history in figure <xref
396 linkend="fig:undo:backout-non-tip"/> illustrates, Mercurial
397 actually commits <emphasis>two</emphasis> changes in this kind
398 of situation (the box-shaped nodes are the ones that Mercurial
399 commits automatically). Before Mercurial begins the backout
400 process, it first remembers what the current parent of the
401 working directory is. It then backs out the target changeset,
402 and commits that as a changeset. Finally, it merges back to
403 the previous parent of the working directory, and commits the
404 result of the merge.</para>
405
406 <!-- TODO: to me it looks like Mercurial doesn't commit the second merge automatically! -->
408
409 <informalfigure id="fig:undo:backout-non-tip">
410 <mediaobject><imageobject><imagedata
411 fileref="undo-non-tip"/></imageobject><textobject><phrase>XXX
412 add text</phrase></textobject><caption><para>Automated
413 backout of a non-tip change using the <command
414 role="hg-cmd">hg backout</command>
415 command</para></caption></mediaobject>
416 </informalfigure>
417
418 <para>The result is that you end up <quote>back where you
419 were</quote>, only with some extra history that undoes the
420 effect of the changeset you wanted to back out.</para>
421
422 <sect3>
423 <title>Always use the <option
424 role="hg-opt-backout">--merge</option> option</title>
425
426 <para>In fact, since the <option
427 role="hg-opt-backout">--merge</option> option will do the
428 <quote>right thing</quote> whether or not the changeset
429 you're backing out is the tip (i.e. it won't try to merge if
430 it's backing out the tip, since there's no need), you should
431 <emphasis>always</emphasis> use this option when you run the
432 <command role="hg-cmd">hg backout</command> command.</para>
433
434 </sect3>
435 </sect2>
436 <sect2>
437 <title>Gaining more control of the backout process</title>
438
439 <para>While I've recommended that you always use the <option
440 role="hg-opt-backout">--merge</option> option when backing
441 out a change, the <command role="hg-cmd">hg backout</command>
442 command lets you decide how to merge a backout changeset.
443 Taking control of the backout process by hand is something you
444 will rarely need to do, but it can be useful to understand
445 what the <command role="hg-cmd">hg backout</command> command
446 is doing for you automatically. To illustrate this, let's
447 clone our first repository, but omit the backout change that
448 it contains.</para>
449
450 &interaction.backout.manual.clone;
451
452 <para>As with our
453 earlier example, we'll commit a third changeset, then back out
454 its parent, and see what happens.</para>
455
456 &interaction.backout.manual.backout;
457
458 <para>Our new changeset is again a descendant of the changeset
459 we backed out; it's thus a new head, <emphasis>not</emphasis>
460 a descendant of the changeset that was the tip. The <command
461 role="hg-cmd">hg backout</command> command was quite
462 explicit in telling us this.</para>
463
464 &interaction.backout.manual.log;
465
466 <para>Again, it's easier to see what has happened by looking at
467 a graph of the revision history, in figure <xref
468 linkend="fig:undo:backout-manual"/>. This makes it clear
469 that when we use <command role="hg-cmd">hg backout</command>
470 to back out a change other than the tip, Mercurial adds a new
471 head to the repository (the change it committed is
472 box-shaped).</para>
473
474 <informalfigure id="fig:undo:backout-manual">
475 <mediaobject><imageobject><imagedata
476 fileref="undo-manual"/></imageobject><textobject><phrase>XXX
477 add text</phrase></textobject><caption><para>Backing out
478 a change using the <command role="hg-cmd">hg
479 backout</command>
480 command</para></caption></mediaobject>
481
482 </informalfigure>
483
484 <para>After the <command role="hg-cmd">hg backout</command>
485 command has completed, it leaves the new
486 <quote>backout</quote> changeset as the parent of the working
487 directory.</para>
488
489 &interaction.backout.manual.parents;
490
491 <para>Now we have two isolated sets of changes.</para>
492
493 &interaction.backout.manual.heads;
494
495 <para>Let's think about what we expect to see as the contents of
496 <filename>myfile</filename> now. The first change should be
497 present, because we've never backed it out. The second change
498 should be missing, as that's the change we backed out. Since
499 the history graph shows the third change as a separate head,
500 we <emphasis>don't</emphasis> expect to see the third change
501 present in <filename>myfile</filename>.</para>
502
503 &interaction.backout.manual.cat;
504
505 <para>To get the third change back into the file, we just do a
506 normal merge of our two heads.</para>
507
508 &interaction.backout.manual.merge;
509
510 <para>Afterwards, the graphical history of our repository looks
511 like figure
512 <xref linkend="fig:undo:backout-manual-merge"/>.</para>
513
514 <informalfigure id="fig:undo:backout-manual-merge">
515 <mediaobject><imageobject><imagedata
516 fileref="undo-manual-merge"/></imageobject><textobject><phrase>XXX
517 add text</phrase></textobject><caption><para>Manually
518 merging a backout change</para></caption></mediaobject>
519
520 </informalfigure>
521
522 </sect2>
523 <sect2>
524 <title>Why <command role="hg-cmd">hg backout</command> works as
525 it does</title>
526
527 <para>Here's a brief description of how the <command
528 role="hg-cmd">hg backout</command> command works.</para>
529 <orderedlist>
530 <listitem><para>It ensures that the working directory is
531 <quote>clean</quote>, i.e. that the output of <command
532 role="hg-cmd">hg status</command> would be empty.</para>
533 </listitem>
534 <listitem><para>It remembers the current parent of the working
535 directory. Let's call this changeset
536 <literal>orig</literal>.</para>
537 </listitem>
538 <listitem><para>It does the equivalent of a <command
539 role="hg-cmd">hg update</command> to sync the working
540 directory to the changeset you want to back out. Let's
541 call this changeset <literal>backout</literal>.</para>
542 </listitem>
543 <listitem><para>It finds the parent of that changeset. Let's
544 call that changeset <literal>parent</literal>.</para>
545 </listitem>
546 <listitem><para>For each file that the
547 <literal>backout</literal> changeset affected, it does the
548 equivalent of a <command role="hg-cmd">hg revert -r
549 parent</command> on that file, to restore it to the
550 contents it had before that changeset was
551 committed.</para>
552 </listitem>
553 <listitem><para>It commits the result as a new changeset.
554 This changeset has <literal>backout</literal> as its
555 parent.</para>
556 </listitem>
557 <listitem><para>If you specify <option
558 role="hg-opt-backout">--merge</option> on the command
559 line, it merges with <literal>orig</literal>, and commits
560 the result of the merge.</para>
561 </listitem></orderedlist>
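<para>The steps above can be sketched as a toy model. This is not how Mercurial is implemented: the functions <literal>three_way</literal> and <literal>backout</literal> are hypothetical, a changeset is reduced to the set of lines in a single file, and the history is assumed to be linear, so that the common ancestor used in the final merge is the changeset being backed out.</para>

```python
def three_way(base, local, other):
    """Merge sets of lines: what both sides kept, plus each side's additions."""
    return local.intersection(other) | (local - base) | (other - base)


def backout(history, target, merge=False):
    """Return the file contents after backing out history[target].

    history is a list of frozensets, oldest first; the last entry is the
    parent of the working directory. target must be at least 1.
    """
    orig = history[-1]                  # step 2
    backed_out = history[target]        # step 3
    parent = history[target - 1]        # step 4
    # Steps 5 and 6: starting from the target's contents and restoring
    # every file it touched leaves exactly the parent's contents.
    undo = frozenset(parent)
    if not merge:
        return undo
    # Step 7: merge the new changeset with orig.
    return three_way(backed_out, undo, orig)


history = [
    frozenset({"first"}),
    frozenset({"first", "second"}),
    frozenset({"first", "second", "third"}),
]
print(sorted(backout(history, 1)))
print(sorted(backout(history, 1, merge=True)))
```

<para>Without the merge, the model yields only the first change, matching the manual example earlier; with it, the third change is carried over as well.</para>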
562
563 <para>An alternative way to implement the <command
564 role="hg-cmd">hg backout</command> command would be to
565 <command role="hg-cmd">hg export</command> the
566 to-be-backed-out changeset as a diff, then use the <option
567 role="cmd-opt-patch">--reverse</option> option to the
568 <command>patch</command> command to reverse the effect of the
569 change without fiddling with the working directory. This
570 sounds much simpler, but it would not work nearly as
571 well.</para>
572
573 <para>The reason that <command role="hg-cmd">hg
574 backout</command> does an update, a commit, a merge, and
575 another commit is to give the merge machinery the best chance
576 to do a good job when dealing with all the changes
577 <emphasis>between</emphasis> the change you're backing out and
578 the current tip.</para>
579
580 <para>If you're backing out a changeset that's 100 revisions
581 back in your project's history, the chances that the
582 <command>patch</command> command will be able to apply a
583 reverse diff cleanly are not good, because intervening changes
584 are likely to have <quote>broken the context</quote> that
585 <command>patch</command> uses to determine whether it can
586 apply a patch (if this sounds like gibberish, see <xref
587 linkend="sec:mq:patch"/> for a
588 discussion of the <command>patch</command> command). Also,
589 Mercurial's merge machinery will handle files and directories
590 being renamed, permission changes, and modifications to binary
591 files, none of which <command>patch</command> can deal
592 with.</para>
593
594 </sect2>
595 </sect1>
596 <sect1 id="sec:undo:aaaiiieee">
597 <title>Changes that should never have been</title>
598
599 <para>Most of the time, the <command role="hg-cmd">hg
600 backout</command> command is exactly what you need if you want
601 to undo the effects of a change. It leaves a permanent record
602 of exactly what you did, both when committing the original
603 changeset and when you cleaned up after it.</para>
604
605 <para>On rare occasions, though, you may find that you've
606 committed a change that really should not be present in the
607 repository at all. For example, it would be very unusual, and
608 usually considered a mistake, to commit a software project's
609 object files as well as its source files. Object files have
610 almost no intrinsic value, and they're <emphasis>big</emphasis>,
611 so they increase the size of the repository and the amount of
612 time it takes to clone or pull changes.</para>
613
614 <para>Before I discuss the options that you have if you commit a
615 <quote>brown paper bag</quote> change (the kind that's so bad
616 that you want to pull a brown paper bag over your head), let me
617 first discuss some approaches that probably won't work.</para>
618
619 <para>Since Mercurial treats history as accumulative&emdash;every
620 change builds on top of all changes that preceded it&emdash;you
621 generally can't just make disastrous changes disappear. The one
622 exception is when you've just committed a change, and it hasn't
623 been pushed or pulled into another repository. That's when you
624 can safely use the <command role="hg-cmd">hg rollback</command>
625 command, as I detailed in section <xref
626 linkend="sec:undo:rollback"/>.</para>
627
628 <para>After you've pushed a bad change to another repository, you
629 <emphasis>could</emphasis> still use <command role="hg-cmd">hg
630 rollback</command> to make your local copy of the change
631 disappear, but it won't have the consequences you want. The
632 change will still be present in the remote repository, so it
633 will reappear in your local repository the next time you
634 pull.</para>
635
636 <para>If a situation like this arises, and you know which
637 repositories your bad change has propagated into, you can
638 <emphasis>try</emphasis> to get rid of the change from
639 <emphasis>every</emphasis> one of those repositories. This is,
640 of course, not a satisfactory solution: if you miss even a
641 single repository while you're expunging, the change is still
642 <quote>in the wild</quote>, and could propagate further.</para>
643
644 <para>If you've committed one or more changes
645 <emphasis>after</emphasis> the change that you'd like to see
646 disappear, your options are further reduced. Mercurial doesn't
647 provide a way to <quote>punch a hole</quote> in history, leaving
648 the other changesets intact.</para>
649
650 <para>XXX This needs filling out. The
651 <literal>hg-replay</literal> script in the
652 <literal>examples</literal> directory works, but doesn't handle
653 merge changesets. Kind of an important omission.</para>
654
655 <sect2>
656 <title>Protect yourself from <quote>escaped</quote>
657 changes</title>
658
659 <para>If you've committed some changes to your local repository
660 and they've been pushed or pulled somewhere else, this isn't
661 necessarily a disaster. You can protect yourself ahead of
662 time against some classes of bad changeset. This is
663 particularly easy if your team usually pulls changes from a
664 central repository.</para>
665
666 <para>By configuring some hooks on that repository to validate
667 incoming changesets (see chapter <xref linkend="chap:hook"/>),
668 you can
669 automatically prevent some kinds of bad changeset from being
670 pushed to the central repository at all. With such a
671 configuration in place, some kinds of bad changeset will
672 naturally tend to <quote>die out</quote> because they can't
673 propagate into the central repository. Better yet, this
674 happens without any need for explicit intervention.</para>
675
676 <para>For instance, an incoming change hook that verifies that a
677 changeset will actually compile can prevent people from
678 inadvertently <quote>breaking the build</quote>.</para>
679
680 </sect2>
681 </sect1>
682 <sect1 id="sec:undo:bisect">
683 <title>Finding the source of a bug</title>
684
685 <para>While it's all very well to be able to back out a changeset
686 that introduced a bug, this requires that you know which
687 changeset to back out. Mercurial provides an invaluable
688 command, called <command role="hg-cmd">hg bisect</command>, that
689 helps you to automate this process and accomplish it very
690 efficiently.</para>
691
692 <para>The idea behind the <command role="hg-cmd">hg
693 bisect</command> command is that a changeset has introduced
694 some change of behaviour that you can identify with a simple
695 binary test. You don't know which piece of code introduced the
696 change, but you know how to test for the presence of the bug.
697 The <command role="hg-cmd">hg bisect</command> command uses your
698 test to direct its search for the changeset that introduced the
699 code that caused the bug.</para>
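<para>To make the idea concrete, here is a minimal sketch of such a search over a strictly linear history. It is an illustration only, not the algorithm <command role="hg-cmd">hg bisect</command> uses, which also has to cope with branches and merges; the function <literal>first_bad</literal> and the test passed to it are hypothetical.</para>

```python
def first_bad(changesets, is_bad):
    """Find the first changeset for which is_bad() is true.

    Assumes changesets[0] is known to be good, changesets[-1] is known
    to be bad, and every changeset after the culprit is also bad.
    Returns the culprit and the number of tests that were run.
    """
    good, bad = 0, len(changesets) - 1
    tests = 0
    while bad - good > 1:
        middle = (good + bad) // 2
        tests += 1
        if is_bad(changesets[middle]):
            bad = middle        # the culprit is at middle or earlier
        else:
            good = middle       # the culprit is later than middle
    return changesets[bad], tests


history = list(range(10000))    # stand-in for 10,000 changesets
culprit, tests = first_bad(history, lambda rev: rev >= 6789)
print(culprit, tests)
```

<para>Each test halves the range that can still contain the culprit, so the number of tests grows with the logarithm of the number of changesets rather than with the number itself.</para>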
700
701 <para>Here are a few scenarios to help you understand how you
702 might apply this command.</para>
703 <itemizedlist>
704 <listitem><para>The most recent version of your software has a
705 bug that you remember wasn't present a few weeks ago, but
706 you don't know when it was introduced. Here, your binary
707 test checks for the presence of that bug.</para>
708 </listitem>
709 <listitem><para>You fixed a bug in a rush, and now it's time to
710 close the entry in your team's bug database. The bug
711 database requires a changeset ID when you close an entry,
712 but you don't remember which changeset you fixed the bug in.
713 Once again, your binary test checks for the presence of the
714 bug.</para>
715 </listitem>
716 <listitem><para>Your software works correctly, but runs 15%
717 slower than the last time you measured it. You want to know
718 which changeset introduced the performance regression. In
719 this case, your binary test measures the performance of your
720 software, to see whether it's <quote>fast</quote> or
721 <quote>slow</quote>.</para>
722 </listitem>
723 <listitem><para>The sizes of the components of your project that
724 you ship exploded recently, and you suspect that something
725 changed in the way you build your project.</para>
726 </listitem></itemizedlist>
727
728 <para>From these examples, it should be clear that the <command
729 role="hg-cmd">hg bisect</command> command is useful for more
730 than finding the sources of bugs. You can use it to find any
731 <quote>emergent property</quote> of a repository (anything that
732 you can't find from a simple text search of the files in the
733 tree) for which you can write a binary test.</para>
734
735 <para>We'll introduce a little bit of terminology here, just to
736 make it clear which parts of the search process are your
737 responsibility, and which are Mercurial's. A
738 <emphasis>test</emphasis> is something that
739 <emphasis>you</emphasis> run when <command role="hg-cmd">hg
740 bisect</command> chooses a changeset. A
741 <emphasis>probe</emphasis> is what <command role="hg-cmd">hg
742 bisect</command> runs to tell whether a revision is good.
743 Finally, we'll use the word <quote>bisect</quote>, as both a
744 noun and a verb, to stand in for the phrase <quote>search using
745 the <command role="hg-cmd">hg bisect</command>
746 command</quote>.</para>
747
748 <para>The most obvious way to automate the search would be
749 to probe every changeset. However, this scales poorly.
750 If it took ten minutes to test a single changeset, and you had
751 10,000 changesets in your repository, the exhaustive approach
752 would take on average 35 <emphasis>days</emphasis> to find the
753 changeset that introduced a bug. Even if you knew that the bug
754 was introduced by one of the last 500 changesets, and limited
755 your search to those, you'd still be looking at over 40 hours to
756 find the changeset that introduced your bug.</para>
757
758 <para>What the <command role="hg-cmd">hg bisect</command> command
759 does is use its knowledge of the <quote>shape</quote> of your
760 project's revision history to perform a search in time
761 proportional to the <emphasis>logarithm</emphasis> of the number
762 of changesets to check (the kind of search it performs is called
763 a dichotomic search). With this approach, searching through
764 10,000 changesets will take less than three hours, even at ten
765 minutes per test (the search will require about 14 tests).
766 Limit your search to the last hundred changesets, and it will
767 take only about an hour (roughly seven tests).</para>
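
<para>To make these estimates concrete, here is a small Python sketch (it is not part of Mercurial) that reproduces the arithmetic. It assumes a linear history, ten minutes per test, and that a binary search needs roughly the base-2 logarithm of the number of changesets, rounded up. The function names are invented for illustration.</para>

```python
import math

MINUTES_PER_TEST = 10  # the figure used in the text


def exhaustive_minutes(changesets):
    """Average cost of testing changesets one by one: half of them, on average."""
    return changesets * MINUTES_PER_TEST / 2


def bisect_tests(changesets):
    """Approximate number of tests a binary search needs: ceil(log2(n))."""
    return math.ceil(math.log2(changesets))


def bisect_minutes(changesets):
    """Total testing time when bisecting, at MINUTES_PER_TEST per test."""
    return bisect_tests(changesets) * MINUTES_PER_TEST


for n in (10000, 500, 100):
    print(n, "changesets:",
          round(exhaustive_minutes(n) / 60, 1), "hours exhaustively,",
          bisect_tests(n), "tests and", bisect_minutes(n), "minutes bisecting")
```

<para>For 10,000 changesets the exhaustive figure comes to roughly 833 hours, or about 35 days, against 14 tests when bisecting.</para>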
768
769 <para>The <command role="hg-cmd">hg bisect</command> command is
770 aware of the <quote>branchy</quote> nature of a Mercurial
771 project's revision history, so it has no problems dealing with
772 branches, merges, or multiple heads in a repository. It can
773 prune entire branches of history with a single probe, which is
774 how it operates so efficiently.</para>
775
776 <sect2>
777 <title>Using the <command role="hg-cmd">hg bisect</command>
778 command</title>
779
780 <para>Here's an example of <command role="hg-cmd">hg
781 bisect</command> in action.</para>
782
783 <note>
784 <para> In versions 0.9.5 and earlier of Mercurial, <command
785 role="hg-cmd">hg bisect</command> was not a core command:
786 it was distributed with Mercurial as an extension. This
787 section describes the built-in command, not the old
788 extension.</para>
789 </note>
790
791 <para>Now let's create a repository, so that we can try out the
792 <command role="hg-cmd">hg bisect</command> command in
793 isolation.</para>
794
795 &interaction.bisect.init;
796
797 <para>We'll simulate a project that has a bug in it in a
798 simple-minded way: create trivial changes in a loop, and
799 nominate one specific change that will have the
800 <quote>bug</quote>. This loop creates 35 changesets, each
801 adding a single file to the repository. We'll represent our
802 <quote>bug</quote> with a file that contains the text <quote>i
803 have a gub</quote>.</para>
804
805 &interaction.bisect.commits;
806
807 <para>The next thing that we'd like to do is figure out how to
808 use the <command role="hg-cmd">hg bisect</command> command.
809 We can use Mercurial's normal built-in help mechanism for
810 this.</para>
811
812 &interaction.bisect.help;
813
814 <para>The <command role="hg-cmd">hg bisect</command> command
815 works in steps. Each step proceeds as follows.</para>
816 <orderedlist>
817 <listitem><para>You run your binary test.</para>
818 <itemizedlist>
819 <listitem><para>If the test succeeded, you tell <command
820 role="hg-cmd">hg bisect</command> by running the
821 <command role="hg-cmd">hg bisect --good</command>
822 command.</para>
823 </listitem>
824 <listitem><para>If it failed, run the <command
825 role="hg-cmd">hg bisect --bad</command>
826 command.</para></listitem></itemizedlist>
827 </listitem>
828 <listitem><para>The command uses your information to decide
829 which changeset to test next.</para>
830 </listitem>
831 <listitem><para>It updates the working directory to that
832 changeset, and the process begins again.</para>
833 </listitem></orderedlist>
834 <para>The process ends when <command role="hg-cmd">hg
835 bisect</command> identifies a unique changeset that marks
836 the point where your test transitioned from
837 <quote>succeeding</quote> to <quote>failing</quote>.</para>
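
<para>The loop described above can be sketched in a few lines of Python. This is only an illustration of the idea on a strictly linear history; the real <command role="hg-cmd">hg bisect</command> command also copes with branches and merges. The function <literal>bisect_linear</literal> and the choice of revision 22 as the culprit are assumptions made for the example.</para>

```python
def bisect_linear(good, bad, is_bad):
    """Find the first bad revision in a linear history.

    good:   a revision number known to pass the test
    bad:    a later revision number known to fail it
    is_bad: callable that runs the test on one revision (a stand-in for
            updating the working directory and running your own test)
    Returns (first_bad_revision, number_of_tests_run).
    """
    tests = 0
    while bad - good > 1:
        mid = (good + bad) // 2   # the changeset to test next
        tests += 1
        if is_bad(mid):
            bad = mid             # plays the role of "hg bisect --bad"
        else:
            good = mid            # plays the role of "hg bisect --good"
    return bad, tests


# Revisions 10 (known good) to 34 (known bad); pretend 22 introduced the bug.
first_bad, tests = bisect_linear(10, 34, lambda rev: rev >= 22)
print(first_bad, tests)
```

<para>Each pass through the loop halves the range that can still contain the first bad changeset, which is why the number of tests grows so slowly.</para>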
838
839 <para>To start the search, we must run the <command
840 role="hg-cmd">hg bisect --reset</command> command.</para>
841
842 &interaction.bisect.search.init;
843
844 <para>In our case, the binary test we use is simple: we check to
845 see if any file in the repository contains the string <quote>i
846 have a gub</quote>. If it does, this changeset contains the
847 change that <quote>caused the bug</quote>. By convention, a
848 changeset that has the property we're searching for is
849 <quote>bad</quote>, while one that doesn't is
850 <quote>good</quote>.</para>
851
852 <para>Most of the time, the revision to which the working
853 directory is synced (usually the tip) already exhibits the
854 problem introduced by the buggy change, so we'll mark it as
855 <quote>bad</quote>.</para>
856
857 &interaction.bisect.search.bad-init;
858
859 <para>Our next task is to nominate a changeset that we know
860 <emphasis>doesn't</emphasis> have the bug; the <command
861 role="hg-cmd">hg bisect</command> command will
862 <quote>bracket</quote> its search between the first pair of
863 good and bad changesets. In our case, we know that revision
864 10 didn't have the bug. (I'll have more to say about choosing
865 the first <quote>good</quote> changeset later.)</para>
866
867 &interaction.bisect.search.good-init;
868
869 <para>Notice that this command printed some output.</para>
870 <itemizedlist>
871 <listitem><para>It told us how many changesets it must
872 consider before it can identify the one that introduced
873 the bug, and how many tests that will require.</para>
874 </listitem>
875 <listitem><para>It updated the working directory to the next
876 changeset to test, and told us which changeset it's
877 testing.</para>
878 </listitem></itemizedlist>
879
880 <para>We now run our test in the working directory. We use the
881 <command>grep</command> command to see if our
882 <quote>bad</quote> file is present in the working directory.
883 If it is, this revision is bad; if not, this revision is good.
884 &interaction.bisect.search.step1;</para>
885
886 <para>This test looks like a perfect candidate for automation,
887 so let's turn it into a shell function.</para>
888 &interaction.bisect.search.mytest;
889
890 <para>We can now run an entire test step with a single command,
891 <literal>mytest</literal>.</para>
892
893 &interaction.bisect.search.step2;
894
895 <para>A few more invocations of our canned test step command,
896 and we're done.</para>
897
898 &interaction.bisect.search.rest;
899
900 <para>Even though our repository contained 35 changesets, the
901 <command role="hg-cmd">hg bisect</command> command let us find
902 the changeset that introduced our <quote>bug</quote> with only
903 five tests. Because the number of tests that the <command
904 role="hg-cmd">hg bisect</command> command performs grows
905 logarithmically with the number of changesets to search, the
906 advantage that it has over the <quote>brute force</quote>
907 search approach increases with every changeset you add.</para>
908
909 </sect2>
910 <sect2>
911 <title>Cleaning up after your search</title>
912
913 <para>When you're finished using the <command role="hg-cmd">hg
914 bisect</command> command in a repository, you can use the
915 <command role="hg-cmd">hg bisect --reset</command> command to
916 drop the information it was using to drive your search. That
917 information doesn't take up much space, so it doesn't matter if
918 you forget to run this command. However, <command
919 role="hg-cmd">hg bisect</command> won't let you start a new
920 search in that repository until you do a <command
921 role="hg-cmd">hg bisect --reset</command>.</para>
922
923 &interaction.bisect.search.reset;
924
925 </sect2>
926 </sect1>
927 <sect1>
928 <title>Tips for finding bugs effectively</title>
929
930 <sect2>
931 <title>Give consistent input</title>
932
933 <para>The <command role="hg-cmd">hg bisect</command> command
934 requires that you correctly report the result of every test
935 you perform. If you tell it that a test failed when it really
936 succeeded, it <emphasis>might</emphasis> be able to detect the
937 inconsistency. If it can identify an inconsistency in your
938 reports, it will tell you that a particular changeset is both
939 good and bad. However, it can't do this perfectly; it's about
940 as likely to report the wrong changeset as the source of the
941 bug, without any warning.</para>
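
<para>A toy simulation shows how a single wrong answer can go unnoticed. The Python sketch below searches a linear history twice, once with honest answers and once with one mistaken <quote>bad</quote> report. The revision numbers and function names are invented for the example, and the search is a plain binary search, not Mercurial's actual implementation.</para>

```python
def bisect_with_answers(good, bad, answer):
    """Binary search over a linear history, trusting answer(rev) blindly."""
    while bad - good > 1:
        mid = (good + bad) // 2
        if answer(mid):
            bad = mid
        else:
            good = mid
    return bad


TRUE_FIRST_BAD = 22  # hypothetical changeset that really introduced the bug


def honest(rev):
    """Report a revision as bad exactly when it contains the bug."""
    return rev >= TRUE_FIRST_BAD


def one_slip(rev):
    """Like honest(), except that revision 16 is wrongly reported as bad."""
    return True if rev == 16 else honest(rev)


print(bisect_with_answers(10, 34, honest))    # the real culprit
print(bisect_with_answers(10, 34, one_slip))  # a confident wrong answer
```

<para>In the second run the search never revisits the mistaken report, so it settles on revision 16 with no hint that anything went wrong.</para>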
942
943 </sect2>
944 <sect2>
945 <title>Automate as much as possible</title>
946
947 <para>When I started using the <command role="hg-cmd">hg
948 bisect</command> command, I tried a few times to run my
949 tests by hand, on the command line. This is an approach that
950 I, at least, am not suited to. After a few tries, I found
951 that I was making enough mistakes that I was having to restart
952 my searches several times before finally getting correct
953 results.</para>
954
955 <para>My initial problems with driving the <command
956 role="hg-cmd">hg bisect</command> command by hand occurred
957 even with simple searches on small repositories; if the
958 problem you're looking for is more subtle, or the number of
959 tests that <command role="hg-cmd">hg bisect</command> must
960 perform increases, the likelihood of operator error ruining
961 the search is much higher. Once I started automating my
962 tests, I had much better results.</para>
963
964 <para>The key to automated testing is twofold:</para>
965 <itemizedlist>
966 <listitem><para>always test for the same symptom, and</para>
967 </listitem>
968 <listitem><para>always feed consistent input to the <command
969 role="hg-cmd">hg bisect</command> command.</para>
970 </listitem></itemizedlist>
971 <para>In my tutorial example above, the <command>grep</command>
972 command tests for the symptom, and the <literal>if</literal>
973 statement takes the result of this check and ensures that we
974 always feed the same input to the <command role="hg-cmd">hg
975 bisect</command> command. The <literal>mytest</literal>
976 function marries these together in a reproducible way, so that
977 every test is uniform and consistent.</para>
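
<para>The same check can be written as a small program, so that nothing depends on what you type at each step. The following Python sketch is one hypothetical way to do it: it scans a directory tree for the tutorial's marker text and turns the result into an exit status, with 0 for good and 1 for bad. The names <literal>has_bug</literal> and <literal>exit_status</literal> are made up for this example.</para>

```python
import os
import tempfile

MARKER = "i have a gub"  # the text that stands in for the bug in the tutorial


def has_bug(root):
    """Return True if any file under root (outside .hg) contains MARKER."""
    for dirpath, dirnames, filenames in os.walk(root):
        if ".hg" in dirnames:
            dirnames.remove(".hg")  # do not scan Mercurial's own metadata
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, errors="replace") as handle:
                if MARKER in handle.read():
                    return True
    return False


def exit_status(root):
    """Map the test result to an exit status: 0 means good, 1 means bad."""
    return 1 if has_bug(root) else 0


# Small self-contained demonstration in a scratch directory.
with tempfile.TemporaryDirectory() as scratch:
    with open(os.path.join(scratch, "file1"), "w") as handle:
        handle.write("nothing to see here\n")
    print(exit_status(scratch))  # no marker yet
    with open(os.path.join(scratch, "file2"), "w") as handle:
        handle.write(MARKER + "\n")
    print(exit_status(scratch))  # marker present
```

<para>Saved as a script that ends by exiting with <literal>exit_status(os.getcwd())</literal>, a check like this could be handed to <command role="hg-cmd">hg bisect --command</command> in versions of Mercurial that provide that option. There, an exit status of 0 marks a changeset as good, 125 skips it, and most other nonzero values mark it as bad. Check <command role="hg-cmd">hg help bisect</command> for your version before relying on this.</para>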
978
979 </sect2>
980 <sect2>
981 <title>Check your results</title>
982
983 <para>Because the output of a <command role="hg-cmd">hg
984 bisect</command> search is only as good as the input you
985 give it, don't take the changeset it reports as the absolute
986 truth. A simple way to cross-check its report is to manually
987 run your test at each of the following changesets:</para>
988 <itemizedlist>
989 <listitem><para>The changeset that it reports as the first bad
990 revision. Your test should still report this as
991 bad.</para>
992 </listitem>
993 <listitem><para>The parent of that changeset (either parent,
994 if it's a merge). Your test should report this changeset
995 as good.</para>
996 </listitem>
997 <listitem><para>A child of that changeset. Your test should
998 report this changeset as bad.</para>
999 </listitem></itemizedlist>
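
<para>The three checks in this list are easy to script. The Python sketch below is a minimal illustration, assuming you already know the parents and children of the reported changeset and have a function that runs your test on a given revision. All of the names in it are hypothetical.</para>

```python
def cross_check(first_bad, parents, children, is_bad):
    """Return a list of complaints; an empty list means the report looks sane."""
    complaints = []
    if not is_bad(first_bad):
        complaints.append("reported changeset %d tests good" % first_bad)
    for rev in parents:
        if is_bad(rev):
            complaints.append("parent %d already tests bad" % rev)
    for rev in children:
        if not is_bad(rev):
            complaints.append("child %d tests good" % rev)
    return complaints


# Hypothetical linear history in which revision 22 introduced the bug.
def is_bad(rev):
    return rev >= 22


print(cross_check(22, [21], [23], is_bad))  # a consistent report
print(cross_check(16, [15], [17], is_bad))  # a report that deserves suspicion
```

<para>An empty list does not prove that the report is right, but any complaint is a strong sign that one of your earlier answers was wrong.</para>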
1000
1001 </sect2>
1002 <sect2>
1003 <title>Beware interference between bugs</title>
1004
1005 <para>It's possible that your search for one bug could be
1006 disrupted by the presence of another. For example, let's say
1007 your software crashes at revision 100, and worked correctly at
1008 revision 50. Unknown to you, someone else introduced a
1009 different crashing bug at revision 60, and fixed it at
1010 revision 80. This could distort your results in one of
1011 several ways.</para>
1012
1013 <para>It is possible that this other bug completely
1014 <quote>masks</quote> yours, which is to say that it occurs
1015 before your bug has a chance to manifest itself. If you can't
1016 avoid that other bug (for example, it prevents your project
1017 from building), and so can't tell whether your bug is present
1018 in a particular changeset, the <command role="hg-cmd">hg
1019 bisect</command> command cannot help you directly. Instead,
1020 you can mark a changeset as untested by running <command
1021 role="hg-cmd">hg bisect --skip</command>.</para>
1022
1023 <para>A different problem could arise if your test for a bug's
1024 presence is not specific enough. If you check for <quote>my
1025 program crashes</quote>, then both your crashing bug and an
1026 unrelated crashing bug that masks it will look like the same
1027 thing, and mislead <command role="hg-cmd">hg
1028 bisect</command>.</para>
1029
1030 <para>Another situation in which <command
1031 role="hg-cmd">hg bisect --skip</command> is useful is when you can't
1032 test a revision because your project was in a broken and hence
1033 untestable state at that revision, perhaps because someone
1034 checked in a change that prevented the project from
1035 building.</para>
1036
1037 </sect2>
1038 <sect2>
1039 <title>Bracket your search lazily</title>
1040
1041 <para>Choosing the first <quote>good</quote> and
1042 <quote>bad</quote> changesets that will mark the end points of
1043 your search is often easy, but it bears a little discussion
1044 nevertheless. From the perspective of <command
1045 role="hg-cmd">hg bisect</command>, the <quote>newer</quote>
1046 changeset is conventionally <quote>bad</quote>, and the older
1047 changeset is <quote>good</quote>.</para>
1048
1049 <para>If you're having trouble remembering when a suitable
1050 <quote>good</quote> change was, so that you can tell <command
1051 role="hg-cmd">hg bisect</command>, you could do worse than
1052 testing changesets at random. Just remember to eliminate
1053 contenders that can't possibly exhibit the bug (perhaps
1054 because the feature with the bug isn't present yet) and those
1055 where another problem masks the bug (as I discussed
1056 above).</para>
1057
1058 <para>Even if you end up <quote>early</quote> by thousands of
1059 changesets or months of history, you will only add a handful
1060 of tests to the total number that <command role="hg-cmd">hg
1061 bisect</command> must perform, thanks to its logarithmic
1062 behaviour.</para>
1063
1064 </sect2>
1065 </sect1>
1066 </chapter>
1067
1068 <!--
1069 local variables:
1070 sgml-parent-document: ("00book.xml" "book" "chapter")
1071 end:
1072 -->