comparison en/ch08-undo.xml @ 749:7e7c47481e4f

Oops, this is the real merge for my hg's oddity
author Dongsheng Song <dongsheng.song@gmail.com>
date Fri, 20 Mar 2009 16:43:35 +0800
parents en/ch09-undo.xml@a13813534ccd
children 1c13ed2130a7
748:d13c7c706a58 749:7e7c47481e4f
1 <!-- vim: set filetype=docbkxml shiftwidth=2 autoindent expandtab tw=77 : -->
2
3 <chapter id="chap.undo">
4 <?dbhtml filename="finding-and-fixing-mistakes.html"?>
5 <title>Finding and fixing mistakes</title>
6
7 <para>To err might be human, but to really handle the consequences
8 well takes a top-notch revision control system. In this chapter,
9 we'll discuss some of the techniques you can use when you find
10 that a problem has crept into your project. Mercurial has some
11 highly capable features that will help you to isolate the sources
12 of problems, and to handle them appropriately.</para>
13
14 <sect1>
15 <title>Erasing local history</title>
16
17 <sect2>
18 <title>The accidental commit</title>
19
20 <para>I have the occasional but persistent problem of typing
21 rather more quickly than I can think, which sometimes results
22 in me committing a changeset that is either incomplete or
23 plain wrong. In my case, the usual kind of incomplete
24 changeset is one in which I've created a new source file, but
25 forgotten to <command role="hg-cmd">hg add</command> it. A
26 <quote>plain wrong</quote> changeset is not as common, but no
27 less annoying.</para>
28
29 </sect2>
30 <sect2 id="sec.undo.rollback">
31 <title>Rolling back a transaction</title>
32
33 <para>In section <xref linkend="sec.concepts.txn"/>, I mentioned
34 that Mercurial treats each modification of a repository as a
35 <emphasis>transaction</emphasis>. Every time you commit a
36 changeset or pull changes from another repository, Mercurial
37 remembers what you did. You can undo, or <emphasis>roll
38 back</emphasis>, exactly one of these actions using the
39 <command role="hg-cmd">hg rollback</command> command. (See
40 section <xref linkend="sec.undo.rollback-after-push"/> for an
41 important caveat about the use of this command.)</para>
42
43 <para>Here's a mistake that I often find myself making:
44 committing a change in which I've created a new file, but
45 forgotten to <command role="hg-cmd">hg add</command>
46 it.</para>
47
48 &interaction.rollback.commit;
49
50 <para>Looking at the output of <command role="hg-cmd">hg
51 status</command> after the commit immediately confirms the
52 error.</para>
53
54 &interaction.rollback.status;
55
56 <para>The commit captured the changes to the file
57 <filename>a</filename>, but not the new file
58 <filename>b</filename>. If I were to push this changeset to a
59 repository that I shared with a colleague, the chances are
60 high that something in <filename>a</filename> would refer to
61 <filename>b</filename>, which would not be present in their
62 repository when they pulled my changes. I would thus become
63 the object of some indignation.</para>
64
65 <para>However, luck is with me&emdash;I've caught my error
66 before I pushed the changeset. I use the <command
67 role="hg-cmd">hg rollback</command> command, and Mercurial
68 makes that last changeset vanish.</para>
69
70 &interaction.rollback.rollback;
71
72 <para>Notice that the changeset is no longer present in the
73 repository's history, and the working directory once again
74 thinks that the file <filename>a</filename> is modified. The
75 commit and rollback have left the working directory exactly as
76 it was prior to the commit; the changeset has been completely
77 erased. I can now safely <command role="hg-cmd">hg
78 add</command> the file <filename>b</filename>, and rerun my
79 commit.</para>
80
81 &interaction.rollback.add;
82
83 </sect2>
84 <sect2>
85 <title>The erroneous pull</title>
86
87 <para>It's common practice with Mercurial to maintain separate
88 development branches of a project in different repositories.
89 Your development team might have one shared repository for
90 your project's <quote>0.9</quote> release, and another,
91 containing different changes, for the <quote>1.0</quote>
92 release.</para>
93
94 <para>Given this, you can imagine that the consequences could be
95 messy if you had a local <quote>0.9</quote> repository, and
96 accidentally pulled changes from the shared <quote>1.0</quote>
97 repository into it. At worst, you could be paying
98 insufficient attention, and push those changes into the shared
99 <quote>0.9</quote> tree, confusing your entire team (but don't
100 worry, we'll return to this horror scenario later). However,
101 it's more likely that you'll notice immediately, because
102 Mercurial will display the URL it's pulling from, or you will
103 see it pull a suspiciously large number of changes into the
104 repository.</para>
105
106 <para>The <command role="hg-cmd">hg rollback</command> command
107 will work nicely to expunge all of the changesets that you
108 just pulled. Mercurial groups all changes from one <command
109 role="hg-cmd">hg pull</command> into a single transaction,
110 so one <command role="hg-cmd">hg rollback</command> is all you
111 need to undo this mistake.</para>
112
113 </sect2>
114 <sect2 id="sec.undo.rollback-after-push">
115 <title>Rolling back is useless once you've pushed</title>
116
117 <para>The value of the <command role="hg-cmd">hg
118 rollback</command> command drops to zero once you've pushed
119 your changes to another repository. Rolling back a change
120 makes it disappear entirely, but <emphasis>only</emphasis> in
121 the repository in which you perform the <command
122 role="hg-cmd">hg rollback</command>. Because a rollback
123 eliminates history, there's no way for the disappearance of a
124 change to propagate between repositories.</para>
125
126 <para>If you've pushed a change to another
127 repository&emdash;particularly if it's a shared
128 repository&emdash;it has essentially <quote>escaped into the
129 wild,</quote> and you'll have to recover from your mistake
130 in a different way. What will happen if you push a changeset
131 somewhere, then roll it back, then pull from the repository
132 you pushed to, is that the changeset will reappear in your
133 repository.</para>
134
135 <para>(If you absolutely know for sure that the change you want
136 to roll back is the most recent change in the repository that
137 you pushed to, <emphasis>and</emphasis> you know that nobody
138 else could have pulled it from that repository, you can roll
139 back the changeset there, too, but you really should not rely
140 on this working reliably. If you do this, sooner or
141 later a change really will make it into a repository that you
142 don't directly control (or have forgotten about), and come
143 back to bite you.)</para>
144
145 </sect2>
146 <sect2>
147 <title>You can only roll back once</title>
148
149 <para>Mercurial stores exactly one transaction in its
150 transaction log; that transaction is the most recent one that
151 occurred in the repository. This means that you can only roll
152 back one transaction. If you expect to be able to roll back
153 one transaction, then its predecessor, this is not the
154 behaviour you will get.</para>
155
156 &interaction.rollback.twice;
157
158 <para>Once you've rolled back one transaction in a repository,
159 you can't roll back again in that repository until you perform
160 another commit or pull.</para>
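<para>The one-slot behaviour can be pictured with a small toy model.
This is only a sketch of the idea, not Mercurial's implementation:
the class name <literal>OneSlotJournal</literal> and its methods are
invented for this illustration. It remembers where history ended
before the most recent transaction, and forgets that as soon as a
rollback has used it.</para>

```python
class OneSlotJournal:
    """Toy model (not Mercurial code): remembers only the last transaction."""

    def __init__(self):
        self.history = []   # changesets, oldest first
        self._undo = None   # length of history before the last transaction

    def transaction(self, changesets):
        """Record a commit (one changeset) or a pull (several) as one unit."""
        self._undo = len(self.history)
        self.history.extend(changesets)

    def rollback(self):
        """Undo the last transaction; return False if there is none to undo."""
        if self._undo is None:
            return False
        del self.history[self._undo:]
        self._undo = None   # the single slot is now empty
        return True


repo = OneSlotJournal()
repo.transaction(["first"])
repo.transaction(["second"])
print(repo.rollback())   # True: "second" is removed
print(repo.rollback())   # False: nothing is left to roll back
print(repo.history)      # ['first']
```

<para>Because a pull is recorded as a single transaction in this
model too, one rollback removes every changeset that the pull
brought in.</para>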
161
162 </sect2>
163 </sect1>
164 <sect1>
165 <title>Reverting the mistaken change</title>
166
167 <para>If you make a modification to a file, and decide that you
168 really didn't want to change the file at all, and you haven't
169 yet committed your changes, the <command role="hg-cmd">hg
170 revert</command> command is the one you'll need. It looks at
171 the changeset that's the parent of the working directory, and
172 restores the contents of the file to their state as of that
173 changeset. (That's a long-winded way of saying that, in the
174 normal case, it undoes your modifications.)</para>
175
176 <para>Let's illustrate how the <command role="hg-cmd">hg
177 revert</command> command works with yet another small example.
178 We'll begin by modifying a file that Mercurial is already
179 tracking.</para>
180
181 &interaction.daily.revert.modify;
182
183 <para>If we don't
184 want that change, we can simply <command role="hg-cmd">hg
185 revert</command> the file.</para>
186
187 &interaction.daily.revert.unmodify;
188
189 <para>The <command role="hg-cmd">hg revert</command> command
190 provides us with an extra degree of safety by saving our
191 modified file with a <filename>.orig</filename>
192 extension.</para>
193
194 &interaction.daily.revert.status;
195
196 <para>Here is a summary of the cases that the <command
197 role="hg-cmd">hg revert</command> command can deal with. We
198 will describe each of these in more detail in the section that
199 follows.</para>
200 <itemizedlist>
201 <listitem><para>If you modify a file, it will restore the file
202 to its unmodified state.</para>
203 </listitem>
204 <listitem><para>If you <command role="hg-cmd">hg add</command> a
205 file, it will undo the <quote>added</quote> state of the
206 file, but leave the file itself untouched.</para>
207 </listitem>
208 <listitem><para>If you delete a file without telling Mercurial,
209 it will restore the file to its unmodified contents.</para>
210 </listitem>
211 <listitem><para>If you use the <command role="hg-cmd">hg
212 remove</command> command to remove a file, it will undo
213 the <quote>removed</quote> state of the file, and restore
214 the file to its unmodified contents.</para>
215 </listitem></itemizedlist>
216
217 <sect2 id="sec.undo.mgmt">
218 <title>File management errors</title>
219
220 <para>The <command role="hg-cmd">hg revert</command> command is
221 useful for more than just modified files. It lets you reverse
222 the results of all of Mercurial's file management
223 commands&emdash;<command role="hg-cmd">hg add</command>,
224 <command role="hg-cmd">hg remove</command>, and so on.</para>
225
226 <para>If you <command role="hg-cmd">hg add</command> a file,
227 then decide that in fact you don't want Mercurial to track it,
228 use <command role="hg-cmd">hg revert</command> to undo the
229 add. Don't worry; Mercurial will not modify the file in any
230 way. It will just <quote>unmark</quote> the file.</para>
231
232 &interaction.daily.revert.add;
233
234 <para>Similarly, if you ask Mercurial to <command
235 role="hg-cmd">hg remove</command> a file, you can use
236 <command role="hg-cmd">hg revert</command> to restore it to
237 the contents it had as of the parent of the working directory.
238 &interaction.daily.revert.remove; This works just as
239 well for a file that you deleted by hand, without telling
240 Mercurial (recall that in Mercurial terminology, this kind of
241 file is called <quote>missing</quote>).</para>
242
243 &interaction.daily.revert.missing;
244
245 <para>If you revert a <command role="hg-cmd">hg copy</command>,
246 the copied-to file remains in your working directory
247 afterwards, untracked. Since a copy doesn't affect the
248 copied-from file in any way, Mercurial doesn't do anything
249 with the copied-from file.</para>
250
251 &interaction.daily.revert.copy;
252
253 <sect3>
254 <title>A slightly special case: reverting a rename</title>
255
256 <para>If you <command role="hg-cmd">hg rename</command> a
257 file, there is one small detail that you should remember.
258 When you <command role="hg-cmd">hg revert</command> a
259 rename, it's not enough to provide the name of the
260 renamed-to file, as you can see here.</para>
261
262 &interaction.daily.revert.rename;
263
264 <para>As you can see from the output of <command
265 role="hg-cmd">hg status</command>, the renamed-to file is
266 no longer identified as added, but the
267 renamed-<emphasis>from</emphasis> file is still removed!
268 This is counter-intuitive (at least to me), but at least
269 it's easy to deal with.</para>
270
271 &interaction.daily.revert.rename-orig;
272
273 <para>So remember, to revert a <command role="hg-cmd">hg
274 rename</command>, you must provide
275 <emphasis>both</emphasis> the source and destination
276 names.</para>
277
278 <para>TODO: the output doesn't look like it will be
279 removed!</para>
280
281 <para>(By the way, if you rename a file, then modify the
282 renamed-to file, then revert both components of the rename,
283 when Mercurial restores the file that was removed as part of
284 the rename, it will be unmodified. If you need the
285 modifications in the renamed-to file to show up in the
286 renamed-from file, don't forget to copy them over.)</para>
287
288 <para>These fiddly aspects of reverting a rename arguably
289 constitute a small bug in Mercurial.</para>
290
291 </sect3>
292 </sect2>
293 </sect1>
294 <sect1>
295 <title>Dealing with committed changes</title>
296
297 <para>Consider a case where you have committed a change
298 <emphasis>a</emphasis>, and another change <emphasis>b</emphasis> on
299 top of it; you then realise that change <emphasis>a</emphasis> was
300 incorrect. Mercurial lets you <quote>back out</quote> an entire changeset automatically,
301 and gives you building blocks that let you reverse part of a changeset by hand.</para>
302
303 <para>Before you read this section, here's something to keep in
304 mind: the <command role="hg-cmd">hg backout</command> command
305 undoes changes by <emphasis>adding</emphasis> history, not by
306 modifying or erasing it. It's the right tool to use if you're
307 fixing bugs, but not if you're trying to undo some change that
308 has catastrophic consequences. To deal with those, see section
309 <xref linkend="sec.undo.aaaiiieee"/>.</para>
310
311 <sect2>
312 <title>Backing out a changeset</title>
313
314 <para>The <command role="hg-cmd">hg backout</command> command
315 lets you <quote>undo</quote> the effects of an entire
316 changeset in an automated fashion. Because Mercurial's
317 history is immutable, this command <emphasis>does
318 not</emphasis> get rid of the changeset you want to undo.
319 Instead, it creates a new changeset that
320 <emphasis>reverses</emphasis> the effect of the to-be-undone
321 changeset.</para>
322
323 <para>The operation of the <command role="hg-cmd">hg
324 backout</command> command is a little intricate, so let's
325 illustrate it with some examples. First, we'll create a
326 repository with some simple changes.</para>
327
328 &interaction.backout.init;
329
330 <para>The <command role="hg-cmd">hg backout</command> command
331 takes a single changeset ID as its argument; this is the
332 changeset to back out. Normally, <command role="hg-cmd">hg
333 backout</command> will drop you into a text editor to write
334 a commit message, so you can record why you're backing the
335 change out. In this example, we provide a commit message on
336 the command line using the <option
337 role="hg-opt-backout">-m</option> option.</para>
338
339 </sect2>
340 <sect2>
341 <title>Backing out the tip changeset</title>
342
343 <para>We're going to start by backing out the last changeset we
344 committed.</para>
345
346 &interaction.backout.simple;
347
348 <para>You can see that the second line from
349 <filename>myfile</filename> is no longer present. Taking a
350 look at the output of <command role="hg-cmd">hg log</command>
351 gives us an idea of what the <command role="hg-cmd">hg
352 backout</command> command has done.
353 &interaction.backout.simple.log; Notice that the new changeset
354 that <command role="hg-cmd">hg backout</command> has created
355 is a child of the changeset we backed out. It's easier to see
356 this in figure <xref
357 endterm="fig.undo.backout.caption" linkend="fig.undo.backout"/>,
358 which presents a graphical
359 view of the change history. As you can see, the history is
360 nice and linear.</para>
361
362 <informalfigure id="fig.undo.backout">
363 <mediaobject>
364 <imageobject><imagedata fileref="images/undo-simple.png"/>
365 </imageobject>
366 <textobject><phrase>XXX add text</phrase></textobject>
367 <caption><para id="fig.undo.backout.caption">Backing out
368 a change using the
369 <command role="hg-cmd">hg backout</command>
370 command</para></caption>
371 </mediaobject>
372 </informalfigure>
373
374 </sect2>
375 <sect2>
376 <title>Backing out a non-tip change</title>
377
378 <para>If you want to back out a change other than the last one
379 you committed, pass the <option
380 role="hg-opt-backout">--merge</option> option to the
381 <command role="hg-cmd">hg backout</command> command.</para>
382
383 &interaction.backout.non-tip.clone;
384
385 <para>This makes backing out any changeset a
386 <quote>one-shot</quote> operation that's usually simple and
387 fast.</para>
388
389 &interaction.backout.non-tip.backout;
390
391 <para>If you take a look at the contents of
392 <filename>myfile</filename> after the backout finishes, you'll
393 see that the first and third changes are present, but not the
394 second.</para>
395
396 &interaction.backout.non-tip.cat;
397
398 <para>As the graphical history in figure <xref
399 endterm="fig.undo.backout-non-tip.caption"
400 linkend="fig.undo.backout-non-tip"/> illustrates, Mercurial
401 actually commits <emphasis>two</emphasis> changes in this kind
402 of situation (the box-shaped nodes are the ones that Mercurial
403 commits automatically). Before Mercurial begins the backout
404 process, it first remembers what the current parent of the
405 working directory is. It then backs out the target changeset,
406 and commits that as a changeset. Finally, it merges back to
407 the previous parent of the working directory, and commits the
408 result of the merge.</para>
409
410 <para>TODO: to me it looks like mercurial doesn't commit the
411 second merge automatically!</para>
412
413 <informalfigure id="fig.undo.backout-non-tip">
414 <mediaobject>
415 <imageobject><imagedata fileref="images/undo-non-tip.png"/>
416 </imageobject>
417 <textobject><phrase>XXX add text</phrase></textobject>
418 <caption><para id="fig.undo.backout-non-tip.caption">Automated
419 backout of a non-tip change using the
420 <command role="hg-cmd">hg backout</command> command</para></caption>
421 </mediaobject>
422 </informalfigure>
423
424 <para>The result is that you end up <quote>back where you
425 were</quote>, only with some extra history that undoes the
426 effect of the changeset you wanted to back out.</para>
427
428 <sect3>
429 <title>Always use the <option
430 role="hg-opt-backout">--merge</option> option</title>
431
432 <para>In fact, since the <option
433 role="hg-opt-backout">--merge</option> option will do the
434 <quote>right thing</quote> whether or not the changeset
435 you're backing out is the tip (i.e. it won't try to merge if
436 it's backing out the tip, since there's no need), you should
437 <emphasis>always</emphasis> use this option when you run the
438 <command role="hg-cmd">hg backout</command> command.</para>
439
440 </sect3>
441 </sect2>
442 <sect2>
443 <title>Gaining more control of the backout process</title>
444
445 <para>While I've recommended that you always use the <option
446 role="hg-opt-backout">--merge</option> option when backing
447 out a change, the <command role="hg-cmd">hg backout</command>
448 command lets you decide how to merge a backout changeset.
449 Taking control of the backout process by hand is something you
450 will rarely need to do, but it can be useful to understand
451 what the <command role="hg-cmd">hg backout</command> command
452 is doing for you automatically. To illustrate this, let's
453 clone our first repository, but omit the backout change that
454 it contains.</para>
455
456 &interaction.backout.manual.clone;
457
458 <para>As with our
459 earlier example, we'll commit a third changeset, then back out
460 its parent, and see what happens.</para>
461
462 &interaction.backout.manual.backout;
463
464 <para>Our new changeset is again a descendant of the changeset
465 we backed out; it's thus a new head, <emphasis>not</emphasis>
466 a descendant of the changeset that was the tip. The <command
467 role="hg-cmd">hg backout</command> command was quite
468 explicit in telling us this.</para>
469
470 &interaction.backout.manual.log;
471
472 <para>Again, it's easier to see what has happened by looking at
473 a graph of the revision history, in figure <xref
474 endterm="fig.undo.backout-manual.caption"
475 linkend="fig.undo.backout-manual"/>. This makes it clear
476 that when we use <command role="hg-cmd">hg backout</command>
477 to back out a change other than the tip, Mercurial adds a new
478 head to the repository (the change it committed is
479 box-shaped).</para>
480
481 <informalfigure id="fig.undo.backout-manual">
482 <mediaobject>
483 <imageobject><imagedata fileref="images/undo-manual.png"/>
484 </imageobject>
485 <textobject><phrase>XXX add text</phrase></textobject>
486 <caption><para id="fig.undo.backout-manual.caption">Backing out a
487 change using the <command role="hg-cmd">hg backout</command>
488 command</para></caption>
489 </mediaobject>
490 </informalfigure>
491
492 <para>After the <command role="hg-cmd">hg backout</command>
493 command has completed, it leaves the new
494 <quote>backout</quote> changeset as the parent of the working
495 directory.</para>
496
497 &interaction.backout.manual.parents;
498
499 <para>Now we have two isolated sets of changes.</para>
500
501 &interaction.backout.manual.heads;
502
503 <para>Let's think about what we expect to see as the contents of
504 <filename>myfile</filename> now. The first change should be
505 present, because we've never backed it out. The second change
506 should be missing, as that's the change we backed out. Since
507 the history graph shows the third change as a separate head,
508 we <emphasis>don't</emphasis> expect to see the third change
509 present in <filename>myfile</filename>.</para>
510
511 &interaction.backout.manual.cat;
512
513 <para>To get the third change back into the file, we just do a
514 normal merge of our two heads.</para>
515
516 &interaction.backout.manual.merge;
517
518 <para>Afterwards, the graphical history of our repository looks
519 like figure
520 <xref endterm="fig.undo.backout-manual-merge.caption"
521 linkend="fig.undo.backout-manual-merge"/>.</para>
522
523 <informalfigure id="fig.undo.backout-manual-merge">
524 <mediaobject>
525 <imageobject><imagedata fileref="images/undo-manual-merge.png"/>
526 </imageobject>
527 <textobject><phrase>XXX add text</phrase></textobject>
528 <caption><para id="fig.undo.backout-manual-merge.caption">Manually
529 merging a backout change</para></caption>
530 </mediaobject>
531 </informalfigure>
532
533 </sect2>
534 <sect2>
535 <title>Why <command role="hg-cmd">hg backout</command> works as
536 it does</title>
537
538 <para>Here's a brief description of how the <command
539 role="hg-cmd">hg backout</command> command works.</para>
540 <orderedlist>
541 <listitem><para>It ensures that the working directory is
542 <quote>clean</quote>, i.e. that the output of <command
543 role="hg-cmd">hg status</command> would be empty.</para>
544 </listitem>
545 <listitem><para>It remembers the current parent of the working
546 directory. Let's call this changeset
547 <literal>orig</literal>.</para>
548 </listitem>
549 <listitem><para>It does the equivalent of a <command
550 role="hg-cmd">hg update</command> to sync the working
551 directory to the changeset you want to back out. Let's
552 call this changeset <literal>backout</literal>.</para>
553 </listitem>
554 <listitem><para>It finds the parent of that changeset. Let's
555 call that changeset <literal>parent</literal>.</para>
556 </listitem>
557 <listitem><para>For each file that the
558 <literal>backout</literal> changeset affected, it does the
559 equivalent of a <command role="hg-cmd">hg revert -r
560 parent</command> on that file, to restore it to the
561 contents it had before that changeset was
562 committed.</para>
563 </listitem>
564 <listitem><para>It commits the result as a new changeset.
565 This changeset has <literal>backout</literal> as its
566 parent.</para>
567 </listitem>
568 <listitem><para>If you specify <option
569 role="hg-opt-backout">--merge</option> on the command
570 line, it merges with <literal>orig</literal>, and commits
571 the result of the merge.</para>
572 </listitem></orderedlist>
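<para>The steps above can be modelled in a few lines of Python.
This is a deliberately simplified sketch, not how Mercurial is
implemented: <literal>ToyRepo</literal> and
<literal>three_way_merge</literal> are made-up names, the repository
tracks a single file as an unordered set of lines, the
<quote>clean working directory</quote> check of step 1 is left out,
and backing out a changeset that has no parent is not handled.</para>

```python
def three_way_merge(base, local, other):
    """Merge two sets of lines that both descend from base."""
    kept = base & local & other                    # lines neither side removed
    return kept | (local - base) | (other - base)  # plus lines either side added


class ToyRepo:
    """Toy history: each changeset is (parents, frozenset of lines)."""

    def __init__(self):
        self.changesets = []
        self.working_parent = None

    def commit(self, lines, parents=None):
        if parents is None:
            parents = [] if self.working_parent is None else [self.working_parent]
        self.changesets.append((list(parents), frozenset(lines)))
        self.working_parent = len(self.changesets) - 1
        return self.working_parent

    def lines(self, rev):
        return self.changesets[rev][1]

    def backout(self, rev, merge=False):
        orig = self.working_parent             # step 2: remember orig
        parent = self.changesets[rev][0][0]    # step 4: parent of backout
        # Steps 3, 5 and 6: commit a child of rev whose contents are
        # those of rev's parent.
        backout_rev = self.commit(self.lines(parent), parents=[rev])
        if merge and orig != rev:              # step 7: merge with orig
            merged = three_way_merge(
                self.lines(rev), self.lines(backout_rev), self.lines(orig))
            return self.commit(merged, parents=[backout_rev, orig])
        return backout_rev


repo = ToyRepo()
repo.commit({"first change"})
second = repo.commit({"first change", "second change"})
repo.commit({"first change", "second change", "third change"})
tip = repo.backout(second, merge=True)
print(sorted(repo.lines(tip)))   # ['first change', 'third change']
```

<para>Note that the merge uses the backed-out changeset as its
common ancestor, which is why the third change survives while the
second does not.</para>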
573
574 <para>An alternative way to implement the <command
575 role="hg-cmd">hg backout</command> command would be to
576 <command role="hg-cmd">hg export</command> the
577 to-be-backed-out changeset as a diff, then use the <option
578 role="cmd-opt-patch">--reverse</option> option to the
579 <command>patch</command> command to reverse the effect of the
580 change without fiddling with the working directory. This
581 sounds much simpler, but it would not work nearly as
582 well.</para>
583
584 <para>The reason that <command role="hg-cmd">hg
585 backout</command> does an update, a commit, a merge, and
586 another commit is to give the merge machinery the best chance
587 to do a good job when dealing with all the changes
588 <emphasis>between</emphasis> the change you're backing out and
589 the current tip.</para>
590
591 <para>If you're backing out a changeset that's 100 revisions
592 back in your project's history, the chances that the
593 <command>patch</command> command will be able to apply a
594 reverse diff cleanly are not good, because intervening changes
595 are likely to have <quote>broken the context</quote> that
596 <command>patch</command> uses to determine whether it can
597 apply a patch (if this sounds like gibberish, see <xref
598 linkend="sec.mq.patch"/> for a
599 discussion of the <command>patch</command> command). Also,
600 Mercurial's merge machinery will handle files and directories
601 being renamed, permission changes, and modifications to binary
602 files, none of which <command>patch</command> can deal
603 with.</para>
604
605 </sect2>
606 </sect1>
607 <sect1 id="sec.undo.aaaiiieee">
608 <title>Changes that should never have been</title>
609
610 <para>Most of the time, the <command role="hg-cmd">hg
611 backout</command> command is exactly what you need if you want
612 to undo the effects of a change. It leaves a permanent record
613 of exactly what you did, both when committing the original
614 changeset and when you cleaned up after it.</para>
615
616 <para>On rare occasions, though, you may find that you've
617 committed a change that really should not be present in the
618 repository at all. For example, it would be very unusual, and
619 usually considered a mistake, to commit a software project's
620 object files as well as its source files. Object files have
621 almost no intrinsic value, and they're <emphasis>big</emphasis>,
622 so they increase the size of the repository and the amount of
623 time it takes to clone or pull changes.</para>
624
625 <para>Before I discuss the options that you have if you commit a
626 <quote>brown paper bag</quote> change (the kind that's so bad
627 that you want to pull a brown paper bag over your head), let me
628 first discuss some approaches that probably won't work.</para>
629
630 <para>Since Mercurial treats history as accumulative&emdash;every
631 change builds on top of all changes that preceded it&emdash;you
632 generally can't just make disastrous changes disappear. The one
633 exception is when you've just committed a change, and it hasn't
634 been pushed or pulled into another repository. That's when you
635 can safely use the <command role="hg-cmd">hg rollback</command>
636 command, as I detailed in section <xref
637 linkend="sec.undo.rollback"/>.</para>
638
639 <para>After you've pushed a bad change to another repository, you
640 <emphasis>could</emphasis> still use <command role="hg-cmd">hg
641 rollback</command> to make your local copy of the change
642 disappear, but it won't have the consequences you want. The
643 change will still be present in the remote repository, so it
644 will reappear in your local repository the next time you
645 pull.</para>
646
647 <para>If a situation like this arises, and you know which
648 repositories your bad change has propagated into, you can
649 <emphasis>try</emphasis> to get rid of the change from
650 <emphasis>every</emphasis> one of those repositories. This is,
651 of course, not a satisfactory solution: if you miss even a
652 single repository while you're expunging, the change is still
653 <quote>in the wild</quote>, and could propagate further.</para>
654
655 <para>If you've committed one or more changes
656 <emphasis>after</emphasis> the change that you'd like to see
657 disappear, your options are further reduced. Mercurial doesn't
658 provide a way to <quote>punch a hole</quote> in history while
659 leaving the surrounding changesets intact.</para>
660
661 <para>XXX This needs filling out. The
662 <literal>hg-replay</literal> script in the
663 <literal>examples</literal> directory works, but doesn't handle
664 merge changesets. Kind of an important omission.</para>
665
666 <sect2>
667 <title>Protect yourself from <quote>escaped</quote>
668 changes</title>
669
670 <para>If you've committed some changes to your local repository
671 and they've been pushed or pulled somewhere else, this isn't
672 necessarily a disaster. You can protect yourself ahead of
673 time against some classes of bad changeset. This is
674 particularly easy if your team usually pulls changes from a
675 central repository.</para>
676
677 <para>By configuring some hooks on that repository to validate
678 incoming changesets (see chapter <xref linkend="chap.hook"/>),
679 you can
680 automatically prevent some kinds of bad changeset from being
681 pushed to the central repository at all. With such a
682 configuration in place, some kinds of bad changeset will
683 naturally tend to <quote>die out</quote> because they can't
684 propagate into the central repository. Better yet, this
685 happens without any need for explicit intervention.</para>
686
687 <para>For instance, an incoming change hook that verifies that a
688 changeset will actually compile can prevent people from
689 inadvertently <quote>breaking the build</quote>.</para>
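<para>A compile check needs a build environment, so as a rough
sketch the following Python hook performs a simpler check instead:
it rejects incoming changesets that touch object files, the mistake
described earlier in this section. Treat it as an illustration under
assumptions rather than a tested recipe. The module name
<filename>checkobjects.py</filename>, the helper
<literal>find_object_files</literal>, and the suffix list are
invented here, and the hook body assumes the in-process hook
convention in which a true return value makes the transaction fail
and <literal>node</literal> identifies the first incoming changeset.
Check the details against chapter <xref linkend="chap.hook"/> and
your version of Mercurial before relying on it.</para>

```python
# Hypothetical module, e.g. checkobjects.py; all names are illustrative.
# File names are handled as bytes, which also works where str is bytes.
OBJECT_SUFFIXES = (b".o", b".obj", b".pyc")


def find_object_files(filenames):
    """Return the names that look like compiler output."""
    return [name for name in filenames if name.endswith(OBJECT_SUFFIXES)]


def hook(ui, repo, node=None, **kwargs):
    """Meant for pretxnchangegroup; returns True to reject the changes."""
    start = repo[node].rev()
    # Assumption: every changeset from `node` up to the tip is incoming.
    for rev in range(start, len(repo)):
        bad = find_object_files(repo[rev].files())
        if bad:
            ui.warn(b"rejecting changeset %d: object files %s\n"
                    % (rev, b", ".join(bad)))
            return True
    return False
```

<para>Assuming the module is importable by Mercurial, it could be
enabled on the central repository with an entry such as
<literal>pretxnchangegroup.noobjects = python:checkobjects.hook</literal>
in the <literal>[hooks]</literal> section of that repository's
<filename>hgrc</filename> file.</para>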
690
691 </sect2>
692 </sect1>
693 <sect1 id="sec.undo.bisect">
694 <title>Finding the source of a bug</title>
695
696 <para>While it's all very well to be able to back out a changeset
697 that introduced a bug, this requires that you know which
698 changeset to back out. Mercurial provides an invaluable
699 command, called <command role="hg-cmd">hg bisect</command>, that
700 helps you to automate this process and accomplish it very
701 efficiently.</para>
702
703 <para>The idea behind the <command role="hg-cmd">hg
704 bisect</command> command is that a changeset has introduced
705 some change of behaviour that you can identify with a simple
706 binary test. You don't know which piece of code introduced the
707 change, but you know how to test for the presence of the bug.
708 The <command role="hg-cmd">hg bisect</command> command uses your
709 test to direct its search for the changeset that introduced the
710 code that caused the bug.</para>
711
712 <para>Here are a few scenarios to help you understand how you
713 might apply this command.</para>
714 <itemizedlist>
715 <listitem><para>The most recent version of your software has a
716 bug that you remember wasn't present a few weeks ago, but
717 you don't know when it was introduced. Here, your binary
718 test checks for the presence of that bug.</para>
719 </listitem>
720 <listitem><para>You fixed a bug in a rush, and now it's time to
721 close the entry in your team's bug database. The bug
722 database requires a changeset ID when you close an entry,
723 but you don't remember which changeset you fixed the bug in.
724 Once again, your binary test checks for the presence of the
725 bug.</para>
726 </listitem>
727 <listitem><para>Your software works correctly, but runs 15%
728 slower than the last time you measured it. You want to know
729 which changeset introduced the performance regression. In
730 this case, your binary test measures the performance of your
731 software, to see whether it's <quote>fast</quote> or
732 <quote>slow</quote>.</para>
733 </listitem>
734 <listitem><para>The sizes of the components of your project that
735 you ship exploded recently, and you suspect that something
736 changed in the way you build your project.</para>
737 </listitem></itemizedlist>
738
739 <para>From these examples, it should be clear that the <command
740 role="hg-cmd">hg bisect</command> command is useful for more
741 than finding the sources of bugs. You can use it to find any
742 <quote>emergent property</quote> of a repository (anything that
743 you can't find from a simple text search of the files in the
744 tree) for which you can write a binary test.</para>
745
746 <para>We'll introduce a little bit of terminology here, just to
747 make it clear which parts of the search process are your
748 responsibility, and which are Mercurial's. A
749 <emphasis>test</emphasis> is something that
750 <emphasis>you</emphasis> run when <command role="hg-cmd">hg
751 bisect</command> chooses a changeset. A
752 <emphasis>probe</emphasis> is what <command role="hg-cmd">hg
753 bisect</command> runs to tell whether a revision is good.
754 Finally, we'll use the word <quote>bisect</quote>, as both a
755 noun and a verb, to stand in for the phrase <quote>search using
756 the <command role="hg-cmd">hg bisect</command>
757 command</quote>.</para>
758
759 <para>One simple way to automate the searching process would be
760 simply to probe every changeset. However, this scales poorly.
761 If it took ten minutes to test a single changeset, and you had
762 10,000 changesets in your repository, the exhaustive approach
763 would take on average 35 <emphasis>days</emphasis> to find the
764 changeset that introduced a bug. Even if you knew that the bug
765 was introduced by one of the last 500 changesets, and limited
766 your search to those, you'd still be looking at over 40 hours to
767 find the changeset that introduced your bug.</para>
768
769 <para>What the <command role="hg-cmd">hg bisect</command> command
770 does is use its knowledge of the <quote>shape</quote> of your
771 project's revision history to perform a search in time
772 proportional to the <emphasis>logarithm</emphasis> of the number
773 of changesets to check (the kind of search it performs is called
774 a dichotomic search). With this approach, searching through
775 10,000 changesets will take less than three hours, even at ten
776 minutes per test (the search will require about 14 tests).
777 Limit your search to the last hundred changesets, and it will
778 take only about an hour (roughly seven tests).</para>
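<para>The arithmetic behind these estimates can be checked with a short Python sketch. The function names are made up for illustration. The test count is the idealised figure for a linear history (the base-2 logarithm of the range, rounded up), so the number of tests Mercurial actually needs may differ slightly.</para>

```python
def exhaustive_minutes(changesets, minutes_per_test):
    """Average cost of probing changesets one by one (half of them, on average)."""
    return changesets * minutes_per_test / 2


def bisect_tests(changesets):
    """Idealised number of tests for a binary search: ceil(log2(changesets))."""
    return (changesets - 1).bit_length()


def bisect_minutes(changesets, minutes_per_test):
    """Cost of a binary search at a fixed cost per test."""
    return bisect_tests(changesets) * minutes_per_test


for n in (10000, 500, 100):
    print("%6d changesets: exhaustive %.1f hours, bisect %d tests (%.1f hours)"
          % (n, exhaustive_minutes(n, 10) / 60,
             bisect_tests(n), bisect_minutes(n, 10) / 60))
```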
779
780 <para>The <command role="hg-cmd">hg bisect</command> command is
781 aware of the <quote>branchy</quote> nature of a Mercurial
782 project's revision history, so it has no problems dealing with
783 branches, merges, or multiple heads in a repository. It can
784 prune entire branches of history with a single probe, which is
785 how it operates so efficiently.</para>
786
787 <sect2>
788 <title>Using the <command role="hg-cmd">hg bisect</command>
789 command</title>
790
791 <para>Here's an example of <command role="hg-cmd">hg
792 bisect</command> in action.</para>
793
794 <note>
795 <para> In versions 0.9.5 and earlier of Mercurial, <command
796 role="hg-cmd">hg bisect</command> was not a core command:
797 it was distributed with Mercurial as an extension. This
798 section describes the built-in command, not the old
799 extension.</para>
800 </note>
801
802 <para>Now let's create a repository, so that we can try out the
803 <command role="hg-cmd">hg bisect</command> command in
804 isolation.</para>
805
806 &interaction.bisect.init;
807
808 <para>We'll simulate a project that has a bug in it in a
809 simple-minded way: create trivial changes in a loop, and
810 nominate one specific change that will have the
811 <quote>bug</quote>. This loop creates 35 changesets, each
812 adding a single file to the repository. We'll represent our
813 <quote>bug</quote> with a file that contains the text <quote>i
814 have a gub</quote>.</para>
815
816 &interaction.bisect.commits;
817
818 <para>The next thing that we'd like to do is figure out how to
819 use the <command role="hg-cmd">hg bisect</command> command.
820 We can use Mercurial's normal built-in help mechanism for
821 this.</para>
822
823 &interaction.bisect.help;
824
825 <para>The <command role="hg-cmd">hg bisect</command> command
826 works in steps. Each step proceeds as follows.</para>
827 <orderedlist>
828 <listitem><para>You run your binary test.</para>
829 <itemizedlist>
830 <listitem><para>If the test succeeded, you tell <command
831 role="hg-cmd">hg bisect</command> by running the
832 <command role="hg-cmd">hg bisect --good</command>
833 command.</para>
834 </listitem>
835 <listitem><para>If it failed, run the <command
836 role="hg-cmd">hg bisect --bad</command>
837 command.</para></listitem></itemizedlist>
838 </listitem>
839 <listitem><para>The command uses your information to decide
840 which changeset to test next.</para>
841 </listitem>
842 <listitem><para>It updates the working directory to that
843 changeset, and the process begins again.</para>
844 </listitem></orderedlist>
845 <para>The process ends when <command role="hg-cmd">hg
846 bisect</command> identifies a unique changeset that marks
847 the point where your test transitioned from
848 <quote>succeeding</quote> to <quote>failing</quote>.</para>
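<para>The steps above can be pictured with a small simulation. This is only a sketch of the idea on a strictly linear history, not Mercurial's implementation, which also has to cope with branches and merges. The revision numbers and the position of the <quote>bug</quote> are hypothetical.</para>

```python
def bisect_linear(good, bad, is_bad):
    """Find the first bad revision between a known-good and a known-bad one.

    Assumes a linear history in which every revision after the first bad
    one is also bad. Returns (first_bad_revision, number_of_tests_run).
    """
    tests = 0
    while bad - good > 1:
        candidate = (good + bad) // 2   # "decide which changeset to test next"
        tests += 1
        if is_bad(candidate):           # you would report "bad"
            bad = candidate
        else:                           # you would report "good"
            good = candidate
    return bad, tests


# Hypothetical setup: revision 10 is known good, revision 34 is known bad,
# and the bug first appears in revision 22.
first_bad, tests = bisect_linear(10, 34, lambda rev: rev >= 22)
print(first_bad, tests)
```

<para>Each pass through the loop corresponds to one step of the search: run the test, report the result, and let the search pick the next changeset.</para>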
849
850 <para>To start the search, we must run the <command
851 role="hg-cmd">hg bisect --reset</command> command.</para>
852
853 &interaction.bisect.search.init;
854
855 <para>In our case, the binary test we use is simple: we check to
856 see if any file in the repository contains the string <quote>i
857 have a gub</quote>. If it does, this changeset contains the
858 change that <quote>caused the bug</quote>. By convention, a
859 changeset that has the property we're searching for is
860 <quote>bad</quote>, while one that doesn't is
861 <quote>good</quote>.</para>
862
863 <para>Most of the time, the revision to which the working
864 directory is synced (usually the tip) already exhibits the
865 problem introduced by the buggy change, so we'll mark it as
866 <quote>bad</quote>.</para>
867
868 &interaction.bisect.search.bad-init;
869
870 <para>Our next task is to nominate a changeset that we know
871 <emphasis>doesn't</emphasis> have the bug; the <command
872 role="hg-cmd">hg bisect</command> command will
873 <quote>bracket</quote> its search between the first pair of
874 good and bad changesets. In our case, we know that revision
875 10 didn't have the bug. (I'll say more about choosing the
876 first <quote>good</quote> changeset later.)</para>
877
878 &interaction.bisect.search.good-init;
879
880 <para>Notice that this command printed some output.</para>
881 <itemizedlist>
882 <listitem><para>It told us how many changesets it must
883 consider before it can identify the one that introduced
884 the bug, and how many tests that will require.</para>
885 </listitem>
886 <listitem><para>It updated the working directory to the next
887 changeset to test, and told us which changeset it's
888 testing.</para>
889 </listitem></itemizedlist>
890
891 <para>We now run our test in the working directory. We use the
892 <command>grep</command> command to see if our
893 <quote>bad</quote> file is present in the working directory.
894 If it is, this revision is bad; if not, this revision is good.
895 &interaction.bisect.search.step1;</para>
896
897 <para>This test looks like a perfect candidate for automation,
898 so let's turn it into a shell function.</para>
899 &interaction.bisect.search.mytest;
900
901 <para>We can now run an entire test step with a single command,
902 <literal>mytest</literal>.</para>
903
904 &interaction.bisect.search.step2;
905
906 <para>A few more invocations of our canned test step command,
907 and we're done.</para>
908
909 &interaction.bisect.search.rest;
910
911 <para>Even though our repository contains 35 changesets, the
912 <command role="hg-cmd">hg bisect</command> command let us find
913 the changeset that introduced our <quote>bug</quote> with only
914 five tests. Because the number of tests that the <command
915 role="hg-cmd">hg bisect</command> command performs grows
916 logarithmically with the number of changesets to search, the
917 advantage that it has over the <quote>brute force</quote>
918 search approach increases with every changeset you add.</para>
919
920 </sect2>
921 <sect2>
922 <title>Cleaning up after your search</title>
923
924 <para>When you're finished using the <command role="hg-cmd">hg
925 bisect</command> command in a repository, you can use the
926 <command role="hg-cmd">hg bisect --reset</command> command to
927 drop the information it was using to drive your search. That
928 information doesn't take up much space, so it doesn't matter if
929 you forget to run this command. However, <command
930 role="hg-cmd">hg bisect</command> won't let you start a new
931 search in that repository until you do a <command
932 role="hg-cmd">hg bisect --reset</command>.</para>
933
934 &interaction.bisect.search.reset;
935
936 </sect2>
937 </sect1>
938 <sect1>
939 <title>Tips for finding bugs effectively</title>
940
941 <sect2>
942 <title>Give consistent input</title>
943
944 <para>The <command role="hg-cmd">hg bisect</command> command
945 requires that you correctly report the result of every test
946 you perform. If you tell it that a test failed when it really
947 succeeded, it <emphasis>might</emphasis> be able to detect the
948 inconsistency. If it can identify an inconsistency in your
949 reports, it will tell you that a particular changeset is both
950 good and bad. However, it can't do this perfectly; it's about
951 as likely to silently report the wrong changeset as the source
952 of the bug as it is to notice the inconsistency.</para>
953
954 </sect2>
955 <sect2>
956 <title>Automate as much as possible</title>
957
958 <para>When I started using the <command role="hg-cmd">hg
959 bisect</command> command, I tried a few times to run my
960 tests by hand, on the command line. This is an approach that
961 I, at least, am not suited to. After a few tries, I found
962 that I was making enough mistakes that I was having to restart
963 my searches several times before finally getting correct
964 results.</para>
965
966 <para>My initial problems with driving the <command
967 role="hg-cmd">hg bisect</command> command by hand occurred
968 even with simple searches on small repositories; if the
969 problem you're looking for is more subtle, or the number of
970 tests that <command role="hg-cmd">hg bisect</command> must
971 perform increases, the likelihood of operator error ruining
972 the search is much higher. Once I started automating my
973 tests, I had much better results.</para>
974
975 <para>The key to automated testing is twofold:</para>
976 <itemizedlist>
977 <listitem><para>always test for the same symptom, and</para>
978 </listitem>
979 <listitem><para>always feed consistent input to the <command
980 role="hg-cmd">hg bisect</command> command.</para>
981 </listitem></itemizedlist>
982 <para>In my tutorial example above, the <command>grep</command>
983 command tests for the symptom, and the <literal>if</literal>
984 statement takes the result of this check and ensures that we
985 always feed the same input to the <command role="hg-cmd">hg
986 bisect</command> command. The <literal>mytest</literal>
987 function marries these together in a reproducible way, so that
988 every test is uniform and consistent.</para>
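<para>The same test can also be written as a small program. The sketch below is one possible Python version of the check. The function names are invented for illustration. It skips the <filename>.hg</filename> directory so that Mercurial's own metadata is never mistaken for project files.</para>

```python
import os

MARKER = "i have a gub"  # the symptom text used in this chapter's example


def tree_has_marker(root, marker=MARKER):
    """Return True if any file under root (outside .hg) contains marker."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so that os.walk never descends into .hg.
        dirnames[:] = [d for d in dirnames if d != ".hg"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, errors="replace") as f:
                if marker in f.read():
                    return True
    return False


def exit_status(root):
    """Map the test result to an exit status: 0 means good, 1 means bad."""
    return 1 if tree_has_marker(root) else 0


# To use this as a standalone script, end the file with:
#     import sys
#     sys.exit(exit_status(os.getcwd()))
```

<para>If your version of Mercurial provides the <option>--command</option> option to <command role="hg-cmd">hg bisect</command>, a script like this can drive the whole search, for example as <command role="hg-cmd">hg bisect --command "python check_gub.py"</command> (the file name is hypothetical). That option treats an exit status of 0 as good and most other non-zero statuses as bad.</para>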
989
990 </sect2>
991 <sect2>
992 <title>Check your results</title>
993
994 <para>Because the output of a <command role="hg-cmd">hg
995 bisect</command> search is only as good as the input you
996 give it, don't take the changeset it reports as the absolute
997 truth. A simple way to cross-check its report is to manually
998 run your test at each of the following changesets:</para>
999 <itemizedlist>
1000 <listitem><para>The changeset that it reports as the first bad
1001 revision. Your test should still report this as
1002 bad.</para>
1003 </listitem>
1004 <listitem><para>The parent of that changeset (either parent,
1005 if it's a merge). Your test should report this changeset
1006 as good.</para>
1007 </listitem>
1008 <listitem><para>A child of that changeset. Your test should
1009 report this changeset as bad.</para>
1010 </listitem></itemizedlist>
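<para>This cross-check is easy to express in code. In the sketch below, <literal>is_bad</literal> stands for whatever test you ran during the search, and the revision numbers are hypothetical. How you look up the parents and children of a changeset is left out.</para>

```python
def cross_check(is_bad, first_bad, parents, children):
    """Return findings that contradict a bisect result (empty list if none)."""
    problems = []
    if not is_bad(first_bad):
        problems.append("reported changeset %s tests good" % first_bad)
    for parent in parents:
        if is_bad(parent):
            problems.append("parent %s already tests bad" % parent)
    for child in children:
        if not is_bad(child):
            problems.append("child %s tests good" % child)
    return problems


# Hypothetical example: the search reported revision 22 as the first bad one.
print(cross_check(lambda rev: rev >= 22, 22, parents=[21], children=[23]))
```

<para>An empty list means the three checks agree with the reported changeset. Anything else is a reason to rerun the search.</para>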
1011
1012 </sect2>
1013 <sect2>
1014 <title>Beware interference between bugs</title>
1015
1016 <para>It's possible that your search for one bug could be
1017 disrupted by the presence of another. For example, let's say
1018 your software crashes at revision 100, and worked correctly at
1019 revision 50. Unknown to you, someone else introduced a
1020 different crashing bug at revision 60, and fixed it at
1021 revision 80. This could distort your results in one of
1022 several ways.</para>
1023
1024 <para>It is possible that this other bug completely
1025 <quote>masks</quote> yours, which is to say that it occurs
1026 before your bug has a chance to manifest itself. If you can't
1027 avoid that other bug (for example, it prevents your project
1028 from building), and so can't tell whether your bug is present
1029 in a particular changeset, the <command role="hg-cmd">hg
1030 bisect</command> command cannot help you directly. Instead,
1031 you can mark a changeset as untested by running <command
1032 role="hg-cmd">hg bisect --skip</command>.</para>
1033
1034 <para>A different problem could arise if your test for a bug's
1035 presence is not specific enough. If you check for <quote>my
1036 program crashes</quote>, then both your crashing bug and an
1037 unrelated crashing bug that masks it will look like the same
1038 thing, and mislead <command role="hg-cmd">hg
1039 bisect</command>.</para>
1040
1041 <para>Another useful situation in which to use <command
1042 role="hg-cmd">hg bisect --skip</command> is if you can't
1043 test a revision because your project was in a broken and hence
1044 untestable state at that revision, perhaps because someone
1045 checked in a change that prevented the project from
1046 building.</para>
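<para>An automated test can make the same three-way decision. The sketch below assumes the exit status convention of the <option>--command</option> option to <command role="hg-cmd">hg bisect</command>, where 0 means good, 125 means skip, and other non-zero values mean bad. The <literal>build</literal> and <literal>has_symptom</literal> callables are placeholders for your own build step and your own, suitably specific, symptom check.</para>

```python
GOOD, BAD, SKIP = 0, 1, 125


def bisect_status(build, has_symptom):
    """Classify the current revision as good, bad, or untestable.

    build and has_symptom are callables returning True or False. The
    symptom check only runs if the build succeeded, so a revision that
    cannot be built is skipped instead of being reported as bad.
    """
    if not build():
        return SKIP
    return BAD if has_symptom() else GOOD


# Hypothetical revision that does not build: it is skipped, not marked bad.
print(bisect_status(lambda: False, lambda: True))
```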
1047
1048 </sect2>
1049 <sect2>
1050 <title>Bracket your search lazily</title>
1051
1052 <para>Choosing the first <quote>good</quote> and
1053 <quote>bad</quote> changesets that will mark the end points of
1054 your search is often easy, but it bears a little discussion
1055 nevertheless. From the perspective of <command
1056 role="hg-cmd">hg bisect</command>, the <quote>newest</quote>
1057 changeset is conventionally <quote>bad</quote>, and the older
1058 changeset is <quote>good</quote>.</para>
1059
1060 <para>If you're having trouble remembering when a suitable
1061 <quote>good</quote> change was, so that you can tell <command
1062 role="hg-cmd">hg bisect</command>, you could do worse than
1063 testing changesets at random. Just remember to eliminate
1064 contenders that can't possibly exhibit the bug (perhaps
1065 because the feature with the bug isn't present yet) and those
1066 where another problem masks the bug (as I discussed
1067 above).</para>
1068
1069 <para>Even if you end up <quote>early</quote> by thousands of
1070 changesets or months of history, you will only add a handful
1071 of tests to the total number that <command role="hg-cmd">hg
1072 bisect</command> must perform, thanks to its logarithmic
1073 behaviour.</para>
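<para>To put a number on <quote>a handful</quote>, the sketch below compares the idealised test counts for a narrow and a wide bracket. The figures assume a linear history and a test count of the base-2 logarithm of the range, rounded up, and the function names are made up for illustration.</para>

```python
def bisect_tests(changesets):
    """Idealised number of tests for a binary search: ceil(log2(changesets))."""
    return (changesets - 1).bit_length()


def extra_tests(narrow, wide):
    """Additional tests needed when the bracket grows from narrow to wide."""
    return bisect_tests(wide) - bisect_tests(narrow)


# Widening the bracket from 500 to 10,000 changesets.
print(extra_tests(500, 10000))
```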
1074
1075 </sect2>
1076 </sect1>
1077 </chapter>
1078
1079 <!--
1080 local variables:
1081 sgml-parent-document: ("00book.xml" "book" "chapter")
1082 end:
1083 -->