comparison en/ch01-intro.xml @ 831:acf9dc5f088d

Add a skeletal preface.
author Bryan O'Sullivan <bos@serpentine.com>
date Thu, 07 May 2009 21:07:35 -0700
parents en/ch00-preface.xml@b338f5490029
830:cbdff5945f9d 831:acf9dc5f088d
1 <!-- vim: set filetype=docbkxml shiftwidth=2 autoindent expandtab tw=77 : -->
2
3 <chapter id="chap:intro">
4 <?dbhtml filename="how-did-we-get-here.html"?>
5 <title>How did we get here?</title>
6
7 <sect1>
8 <title>Why revision control? Why Mercurial?</title>
9
10 <para id="x_6d">Revision control is the process of managing multiple
11 versions of a piece of information. In its simplest form, this
12 is something that many people do by hand: every time you modify
13 a file, save it under a new name that contains a number, each
14 one higher than the number of the preceding version.</para>
15
16 <para id="x_6e">Manually managing multiple versions of even a single file is
17 an error-prone task, though, so software tools to help automate
18 this process have long been available. The earliest automated
19 revision control tools were intended to help a single user to
20 manage revisions of a single file. Over the past few decades,
21 the scope of revision control tools has expanded greatly; they
22 now manage multiple files, and help multiple people to work
23 together. The best modern revision control tools have no
24 problem coping with thousands of people working together on
25 projects that consist of hundreds of thousands of files.</para>
26
27 <para id="x_6f">The arrival of distributed revision control is relatively
28 recent, and so far this new field has grown due to people's
29 willingness to explore ill-charted territory.</para>
30
31 <para id="x_70">I am writing a book about distributed revision control
32 because I believe that it is an important subject that deserves
33 a field guide. I chose to write about Mercurial because it is
34 the easiest tool to learn the terrain with, and yet it scales to
35 the demands of real, challenging environments where many other
36 revision control tools buckle.</para>
37
38 <sect2>
39 <title>Why use revision control?</title>
40
41 <para id="x_71">There are a number of reasons why you or your team might
42 want to use an automated revision control tool for a
43 project.</para>
44
45 <itemizedlist>
46 <listitem><para id="x_72">It will track the history and evolution of
47 your project, so you don't have to. For every change,
48 you'll have a log of <emphasis>who</emphasis> made it;
49 <emphasis>why</emphasis> they made it;
50 <emphasis>when</emphasis> they made it; and
51 <emphasis>what</emphasis> the change
52 was.</para></listitem>
53 <listitem><para id="x_73">When you're working with other people,
54 revision control software makes it easier for you to
55 collaborate. For example, when people more or less
56 simultaneously make potentially incompatible changes, the
57 software will help you to identify and resolve those
58 conflicts.</para></listitem>
59 <listitem><para id="x_74">It can help you to recover from mistakes. If
60 you make a change that later turns out to be in error, you
61 can revert to an earlier version of one or more files. In
62 fact, a <emphasis>really</emphasis> good revision control
63 tool will even help you to efficiently figure out exactly
64 when a problem was introduced (see <xref
65 linkend="sec:undo:bisect"/> for details).</para></listitem>
66 <listitem><para id="x_75">It will help you to work simultaneously on,
67 and manage the drift between, multiple versions of your
68 project.</para></listitem>
69 </itemizedlist>
70
71 <para id="x_76">Most of these reasons are equally
72 valid&emdash;at least in theory&emdash;whether you're working
73 on a project by yourself, or with a hundred other
74 people.</para>
75
76 <para id="x_77">A key question about the practicality of revision control
77 at these two different scales (<quote>lone hacker</quote> and
78 <quote>huge team</quote>) is how its
79 <emphasis>benefits</emphasis> compare to its
80 <emphasis>costs</emphasis>. A revision control tool that's
81 difficult to understand or use is going to impose a high
82 cost.</para>
83
84 <para id="x_78">A five-hundred-person project is likely to collapse under
85 its own weight almost immediately without a revision control
86 tool and process. In this case, the cost of using revision
87 control might hardly seem worth considering, since
88 <emphasis>without</emphasis> it, failure is almost
89 guaranteed.</para>
90
91 <para id="x_79">On the other hand, a one-person <quote>quick hack</quote>
92 might seem like a poor place to use a revision control tool,
93 because surely the cost of using one must be close to the
94 overall cost of the project. Right?</para>
95
96 <para id="x_7a">Mercurial uniquely supports <emphasis>both</emphasis> of
97 these scales of development. You can learn the basics in just
98 a few minutes, and due to its low overhead, you can apply
99 revision control to the smallest of projects with ease. Its
100 simplicity means you won't have a lot of abstruse concepts or
101 command sequences competing for mental space with whatever
102 you're <emphasis>really</emphasis> trying to do. At the same
103 time, Mercurial's high performance and peer-to-peer nature let
104 you scale painlessly to handle large projects.</para>
105
106 <para id="x_7b">No revision control tool can rescue a poorly run project,
107 but a good choice of tools can make a huge difference to the
108 fluidity with which you can work on a project.</para>
109
110 </sect2>
111
112 <sect2>
113 <title>The many names of revision control</title>
114
115 <para id="x_7c">Revision control is a diverse field, so much so that it is
116 referred to by many names and acronyms. Here are a few of the
117 more common variations you'll encounter:</para>
118 <itemizedlist>
119 <listitem><para id="x_7d">Revision control (RCS)</para></listitem>
120 <listitem><para id="x_7e">Software configuration management (SCM), or
121 configuration management</para></listitem>
122 <listitem><para id="x_7f">Source code management</para></listitem>
123 <listitem><para id="x_80">Source code control, or source
124 control</para></listitem>
125 <listitem><para id="x_81">Version control
126 (VCS)</para></listitem></itemizedlist>
127 <para id="x_82">Some people claim that these terms actually have different
128 meanings, but in practice they overlap so much that there's no
129 agreed or even useful way to tease them apart.</para>
130
131 </sect2>
132 </sect1>
133
134 <sect1>
135 <title>About the examples in this book</title>
136
137 <para id="x_84">This book takes an unusual approach to code samples. Every
138 example is <quote>live</quote>&emdash;each one is actually the result
139 of a shell script that executes the Mercurial commands you see.
140 Every time an image of the book is built from its sources, all
141 the example scripts are automatically run, and their current
142 results compared against their expected results.</para>
143
144 <para id="x_85">The advantage of this approach is that the examples are
145 always accurate; they describe <emphasis>exactly</emphasis> the
146 behavior of the version of Mercurial that's mentioned at the
147 front of the book. If I update the version of Mercurial that
148 I'm documenting, and the output of some command changes, the
149 build fails.</para>
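<para>The checking step can be pictured with a few lines of shell. This is
only a sketch of the idea, not the book's actual build machinery: the
function name <literal>check_example</literal> and the way the expected
output is passed in are assumptions made for illustration.</para>

<programlisting>
```shell
# Hypothetical helper: check_example EXPECTED COMMAND [ARGS...]
# Runs COMMAND, captures what it prints, and compares that with EXPECTED.
check_example() {
    expected=$1
    shift
    # Capture the command's current standard output.
    actual=$("$@")
    if [ "$actual" = "$expected" ]; then
        echo "ok"
    else
        # A real build would stop here; this sketch only reports and
        # signals failure through the return status.
        echo "output changed"
        return 1
    fi
}
```
</programlisting>

<para>For instance, <literal>check_example "hello" echo hello</literal>
prints <literal>ok</literal>, whereas a command whose output has drifted
makes the function return a non-zero status, which a build script could
treat as a failure.</para>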
150
151 <para id="x_86">There is a small disadvantage to this approach, which is
152 that the dates and times you'll see in examples tend to be
153 <quote>squashed</quote> together in a way that they wouldn't be
154 if the same commands were being typed by a human. Where a human
155 can issue no more than one command every few seconds, with any
156 resulting timestamps correspondingly spread out, my automated
157 example scripts run many commands in one second.</para>
158
159 <para id="x_87">As an instance of this, several consecutive commits in an
160 example can show up as having occurred during the same second.
161 You can see this occur in the <literal
162 role="hg-ext">bisect</literal> example in <xref
163 linkend="sec:undo:bisect"/>, for instance.</para>
164
165 <para id="x_88">So when you're reading examples, don't place too much weight
166 on the dates or times you see in the output of commands. But
167 <emphasis>do</emphasis> be confident that the behavior you're
168 seeing is consistent and reproducible.</para>
169
170 </sect1>
171
172 <sect1>
173 <title>Trends in the field</title>
174
175 <para id="x_89">There has been an unmistakable trend in the development and
176 use of revision control tools over the past four decades, as
177 people have become familiar with the capabilities of their tools
178 and constrained by their limitations.</para>
179
180 <para id="x_8a">The first generation began by managing single files on
181 individual computers. Although these tools represented a huge
182 advance over ad-hoc manual revision control, their locking model
183 and reliance on a single computer limited them to small,
184 tightly-knit teams.</para>
185
186 <para id="x_8b">The second generation loosened these constraints by moving
187 to network-centered architectures, and managing entire projects
188 at a time. As projects grew larger, they ran into new problems.
189 With clients needing to talk to servers very frequently, server
190 scaling became an issue for large projects. An unreliable
191 network connection could prevent remote users from being able to
192 talk to the server at all. As open source projects started
193 making read-only access available anonymously to anyone, people
194 without commit privileges found that they could not use the
195 tools to interact with a project in a natural way, as they could
196 not record their changes.</para>
197
198 <para id="x_8c">The current generation of revision control tools is
199 peer-to-peer in nature. All of these systems have dropped the
200 dependency on a single central server, and allow people to
201 distribute their revision control data to where it's actually
202 needed. Collaboration over the Internet has moved from being
203 constrained by technology to being a matter of choice and consensus.
204 Modern tools can operate offline indefinitely and autonomously,
205 with a network connection only needed when syncing changes with
206 another repository.</para>
207
208 </sect1>
209 <sect1>
210 <title>A few of the advantages of distributed revision
211 control</title>
212
213 <para id="x_8d">Even though distributed revision control tools have for
214 several years been as robust and usable as their
215 previous-generation counterparts, people using older tools have
216 not necessarily woken up to their advantages yet. There are a
217 number of ways in which distributed tools shine relative to
218 centralised ones.</para>
219
220 <para id="x_8e">For an individual developer, distributed tools are almost
221 always much faster than centralised tools. This is for a simple
222 reason: a centralised tool needs to talk over the network for
223 many common operations, because most metadata is stored in a
224 single copy on the central server. A distributed tool stores
225 all of its metadata locally. All else being equal, talking over
226 the network adds overhead to a centralised tool. Don't
227 underestimate the value of a snappy, responsive tool: you're
228 going to spend a lot of time interacting with your revision
229 control software.</para>
230
231 <para id="x_8f">Distributed tools are indifferent to the vagaries of your
232 server infrastructure, again because they replicate metadata to
233 so many locations. If you use a centralised system and your
234 server catches fire, you'd better hope that your backup media
235 are reliable, and that your last backup was recent and actually
236 worked. With a distributed tool, you have many backups
237 available on every contributor's computer.</para>
238
239 <para id="x_90">The reliability of your network will affect distributed
240 tools far less than it will centralised tools. You can't even
241 use a centralised tool without a network connection, except for
242 a few highly constrained commands. With a distributed tool, if
243 your network connection goes down while you're working, you may
244 not even notice. The only thing you won't be able to do is talk
245 to repositories on other computers, something that is relatively
246 rare compared with local operations. If you have a far-flung
247 team of collaborators, this may be significant.</para>
248
249 <sect2>
250 <title>Advantages for open source projects</title>
251
252 <para id="x_91">If you take a shine to an open source project and decide
253 that you would like to start hacking on it, and that project
254 uses a distributed revision control tool, you are at once a
255 peer with the people who consider themselves the
256 <quote>core</quote> of that project. If they publish their
257 repositories, you can immediately copy their project history,
258 start making changes, and record your work, using the same
259 tools in the same ways as insiders. By contrast, with a
260 centralised tool, you must use the software in a <quote>read
261 only</quote> mode unless someone grants you permission to
262 commit changes to their central server. Until then, you won't
263 be able to record changes, and your local modifications will
264 be at risk of corruption any time you try to update your
265 client's view of the repository.</para>
266
267 <sect3>
268 <title>The forking non-problem</title>
269
270 <para id="x_92">It has been suggested that distributed revision control
271 tools pose some sort of risk to open source projects because
272 they make it easy to <quote>fork</quote> the development of
273 a project. A fork happens when there are differences in
274 opinion or attitude between groups of developers that cause
275 them to decide that they can't work together any longer.
276 Each side takes a more or less complete copy of the
277 project's source code, and goes off in its own
278 direction.</para>
279
280 <para id="x_93">Sometimes the camps in a fork decide to reconcile their
281 differences. With a centralised revision control system, the
282 <emphasis>technical</emphasis> process of reconciliation is
283 painful, and has to be performed largely by hand. You have
284 to decide whose revision history is going to
285 <quote>win</quote>, and graft the other team's changes into
286 the tree somehow. This usually loses some or all of one
287 side's revision history.</para>
288
289 <para id="x_94">Distributed tools respond to the question of forking
290 by making it the <emphasis>only</emphasis> way to
291 develop a project. Every single change that you make is
292 potentially a fork point. The great strength of this
293 approach is that a distributed revision control tool has to
294 be really good at <emphasis>merging</emphasis> forks,
295 because forks are absolutely fundamental: they happen all
296 the time.</para>
297
298 <para id="x_95">If every piece of work that everybody does, all the
299 time, is framed in terms of forking and merging, then what
300 the open source world refers to as a <quote>fork</quote>
301 becomes <emphasis>purely</emphasis> a social issue. If
302 anything, distributed tools <emphasis>lower</emphasis> the
303 likelihood of a fork:</para>
304 <itemizedlist>
305 <listitem><para id="x_96">They eliminate the social distinction that
306 centralised tools impose: that between insiders (people
307 with commit access) and outsiders (people
308 without).</para></listitem>
309 <listitem><para id="x_97">They make it easier to reconcile after a
310 social fork, because all that's involved from the
311 perspective of the revision control software is just
312 another merge.</para></listitem></itemizedlist>
313
314 <para id="x_98">Some people resist distributed tools because they want
315 to retain tight control over their projects, and they
316 believe that centralised tools give them this control.
317 However, if you're of this belief, and you publish your CVS
318 or Subversion repositories publicly, there are plenty of
319 tools available that can pull out your entire project's
320 history (albeit slowly) and recreate it somewhere that you
321 don't control. So while your control in this case is
322 illusory, you are forgoing the ability to fluidly
323 collaborate with whatever people feel compelled to mirror
324 and fork your history.</para>
325
326 </sect3>
327 </sect2>
328 <sect2>
329 <title>Advantages for commercial projects</title>
330
331 <para id="x_99">Many commercial projects are undertaken by teams that are
332 scattered across the globe. Contributors who are far from a
333 central server will see slower command execution and perhaps
334 less reliability. Commercial revision control systems attempt
335 to ameliorate these problems with remote-site replication
336 add-ons that are typically expensive to buy and cantankerous
337 to administer. A distributed system doesn't suffer from these
338 problems in the first place. Better yet, you can easily set
339 up multiple authoritative servers, say one per site, so that
340 there's no redundant communication between repositories over
341 expensive long-haul network links.</para>
342
343 <para id="x_9a">Centralised revision control systems tend to have
344 relatively low scalability. It's not unusual for an expensive
345 centralised system to fall over under the combined load of
346 just a few dozen concurrent users. Once again, the typical
347 response tends to be an expensive and clunky replication
348 facility. Since the load on a central server&emdash;if you have
349 one at all&emdash;is many times lower with a distributed tool
350 (because all of the data is replicated everywhere), a single
351 cheap server can handle the needs of a much larger team, and
352 replication to balance load becomes a simple matter of
353 scripting.</para>
354
355 <para id="x_9b">If you have an employee in the field, troubleshooting a
356 problem at a customer's site, they'll benefit from distributed
357 revision control. The tool will let them generate custom
358 builds, try different fixes in isolation from each other, and
359 search efficiently through history for the sources of bugs and
360 regressions in the customer's environment, all without needing
361 to connect to your company's network.</para>
362
363 </sect2>
364 </sect1>
365 <sect1>
366 <title>Why choose Mercurial?</title>
367
368 <para id="x_9c">Mercurial has a unique set of properties that make it a
369 particularly good choice as a revision control system.</para>
370 <itemizedlist>
371 <listitem><para id="x_9d">It is easy to learn and use.</para></listitem>
372 <listitem><para id="x_9e">It is lightweight.</para></listitem>
373 <listitem><para id="x_9f">It scales excellently.</para></listitem>
374 <listitem><para id="x_a0">It is easy to
375 customise.</para></listitem></itemizedlist>
376
377 <para id="x_a1">If you are at all familiar with revision control systems,
378 you should be able to get up and running with Mercurial in less
379 than five minutes. Even if not, it will take no more than a few
380 minutes longer. Mercurial's command and feature sets are
381 generally uniform and consistent, so you can keep track of a few
382 general rules instead of a host of exceptions.</para>
383
384 <para id="x_a2">On a small project, you can start working with Mercurial in
385 moments. Creating new changes and branches; transferring changes
386 around (whether locally or over a network); and history and
387 status operations are all fast. Mercurial attempts to stay
388 nimble and largely out of your way by combining low cognitive
389 overhead with blazingly fast operations.</para>
390
391 <para id="x_a3">The usefulness of Mercurial is not limited to small
392 projects: it is used by projects with hundreds to thousands of
393 contributors, and with tens of thousands of files and
394 hundreds of megabytes of source code apiece.</para>
395
396 <para id="x_a4">If the core functionality of Mercurial is not enough for
397 you, it's easy to build on. Mercurial is well suited to
398 scripting tasks, and its clean internals and implementation in
399 Python make it easy to add features in the form of extensions.
400 There are a number of popular and useful extensions already
401 available, ranging from helping to identify bugs to improving
402 performance.</para>
403
404 </sect1>
405 <sect1>
406 <title>Mercurial compared with other tools</title>
407
408 <para id="x_a5">Before you read on, please understand that this section
409 necessarily reflects my own experiences, interests, and (dare I
410 say it) biases. I have used every one of the revision control
411 tools listed below, in most cases for several years at a
412 time.</para>
413
414
415 <sect2>
416 <title>Subversion</title>
417
418 <para id="x_a6">Subversion is a popular revision control tool, developed
419 to replace CVS. It has a centralised client/server
420 architecture.</para>
421
422 <para id="x_a7">Subversion and Mercurial have similarly named commands for
423 performing the same operations, so if you're familiar with
424 one, it is easy to learn to use the other. Both tools are
425 portable to all popular operating systems.</para>
426
427 <para id="x_a8">Prior to version 1.5, Subversion had no useful support for
428 merges. At the time of writing, its merge tracking capability
429 is new, and known to be <ulink
430 url="http://svnbook.red-bean.com/nightly/en/svn.branchmerge.advanced.html#svn.branchmerge.advanced.finalword">complicated
431 and buggy</ulink>.</para>
432
433 <para id="x_a9">Mercurial has a substantial performance advantage over
434 Subversion on every revision control operation I have
435 benchmarked. I have measured its advantage as ranging from a
436 factor of two to a factor of six when compared with Subversion
437 1.4.3's <emphasis>ra_local</emphasis> file store, which is the
438 fastest access method available. In more realistic
439 deployments involving a network-based store, Subversion will
440 be at a substantially larger disadvantage. Because many
441 Subversion commands must talk to the server and Subversion
442 does not have useful replication facilities, server capacity
443 and network bandwidth become bottlenecks for modestly large
444 projects.</para>
445
446 <para id="x_aa">Additionally, Subversion incurs substantial storage
447 overhead to avoid network transactions for a few common
448 operations, such as finding modified files
449 (<literal>status</literal>) and displaying modifications
450 against the current revision (<literal>diff</literal>). As a
451 result, a Subversion working copy is often the same size as,
452 or larger than, a Mercurial repository and working directory,
453 even though the Mercurial repository contains a complete
454 history of the project.</para>
455
456 <para id="x_ab">Subversion is widely supported by third party tools.
457 Mercurial currently lags considerably in this area. This gap
458 is closing, however, and indeed some of Mercurial's GUI tools
459 now outshine their Subversion equivalents. Like Mercurial,
460 Subversion has an excellent user manual.</para>
461
462 <para id="x_ac">Because Subversion doesn't store revision history on the
463 client, it is well suited to managing projects that deal with
464 lots of large, opaque binary files. If you check in fifty
465 revisions to an incompressible 10MB file, Subversion's
466 client-side space usage stays constant. The space used by any
467 distributed SCM will grow rapidly in proportion to the number
468 of revisions, because the differences between each revision
469 are large.</para>
470
471 <para id="x_ad">In addition, it's often difficult or, more usually,
472 impossible to merge different versions of a binary file.
473 Subversion's ability to let a user lock a file, so that they
474 temporarily have the exclusive right to commit changes to it,
475 can be a significant advantage to a project where binary files
476 are widely used.</para>
477
478 <para id="x_ae">Mercurial can import revision history from a Subversion
479 repository. It can also export revision history to a
480 Subversion repository. This makes it easy to <quote>test the
481 waters</quote> and use Mercurial and Subversion in parallel
482 before deciding to switch. History conversion is incremental,
483 so you can perform an initial conversion, then small
484 additional conversions afterwards to bring in new
485 changes.</para>
486
487
488 </sect2>
489 <sect2>
490 <title>Git</title>
491
492 <para id="x_af">Git is a distributed revision control tool that was
493 developed for managing the Linux kernel source tree. Like
494 Mercurial, its early design was somewhat influenced by
495 Monotone.</para>
496
497 <para id="x_b0">Git has a very large command set, with version 1.5.0
498 providing 139 individual commands. It has something of a
499 reputation for being difficult to learn. Compared to Git,
500 Mercurial has a strong focus on simplicity.</para>
501
502 <para id="x_b1">In terms of performance, Git is extremely fast. In
503 several cases, it is faster than Mercurial, at least on Linux,
504 while Mercurial performs better on other operations. However,
505 on Windows, the performance and general level of support that
506 Git provides are, at the time of writing, far behind those of
507 Mercurial.</para>
508
509 <para id="x_b2">While a Mercurial repository needs no maintenance, a Git
510 repository requires frequent manual <quote>repacks</quote> of
511 its metadata. Without these, performance degrades, while
512 space usage grows rapidly. A server that contains many Git
513 repositories that are not rigorously and frequently repacked
514 will become heavily disk-bound during backups, and there have
515 been instances of daily backups taking far longer than 24
516 hours as a result. A freshly packed Git repository is
517 slightly smaller than a Mercurial repository, but an unpacked
518 repository is several orders of magnitude larger.</para>
519
520 <para id="x_b3">The core of Git is written in C. Many Git commands are
521 implemented as shell or Perl scripts, and the quality of these
522 scripts varies widely. I have encountered several instances
523 where scripts charged along blindly in the presence of errors
524 that should have been fatal.</para>
525
526 <para id="x_b4">Mercurial can import revision history from a Git
527 repository.</para>
528
529
530 </sect2>
531 <sect2>
532 <title>CVS</title>
533
534 <para id="x_b5">CVS is probably the most widely used revision control tool
535 in the world. Due to its age and internal untidiness, it has
536 been only lightly maintained for many years.</para>
537
538 <para id="x_b6">It has a centralised client/server architecture. It does
539 not group related file changes into atomic commits, making it
540 easy for people to <quote>break the build</quote>: one person
541 can successfully commit part of a change and then be blocked
542 by the need for a merge, leaving other people able to see only a
543 portion of the work that person intended to commit. This also affects
544 how you work with project history. If you want to see all of
545 the modifications someone made as part of a task, you will
546 need to manually inspect the descriptions and timestamps of
547 the changes made to each file involved (if you even know what
548 those files were).</para>
549
550 <para id="x_b7">CVS has a muddled notion of tags and branches that I will
551 not even attempt to describe. It does not support renaming of
552 files or directories well, making it easy to corrupt a
553 repository. It has almost no internal consistency checking
554 capabilities, so it is usually not even possible to tell
555 whether or how a repository is corrupt. I would not recommend
556 CVS for any project, existing or new.</para>
557
558 <para id="x_b8">Mercurial can import CVS revision history. However, there
559 are a few caveats that apply; these are true of every other
560 revision control tool's CVS importer, too. Due to CVS's lack
561 of atomic changes and unversioned filesystem hierarchy, it is
562 not possible to reconstruct CVS history completely accurately;
563 some guesswork is involved, and renames will usually not show
564 up. Because a lot of advanced CVS administration has to be
565 done by hand and is hence error-prone, it's common for CVS
566 importers to run into multiple problems with corrupted
567 repositories (completely bogus revision timestamps and files
568 that have remained locked for over a decade are just two of
569 the less interesting problems I can recall from personal
570 experience).</para>
571
572 <para id="x_b9">Mercurial can import revision history from a CVS
573 repository.</para>
574
575
576 </sect2>
577 <sect2>
578 <title>Commercial tools</title>
579
580 <para id="x_ba">Perforce has a centralised client/server architecture,
581 with no client-side caching of any data. Unlike modern
582 revision control tools, Perforce requires that a user run a
583 command to inform the server about every file they intend to
584 edit.</para>
585
586 <para id="x_bb">The performance of Perforce is quite good for small teams,
587 but it falls off rapidly as the number of users grows beyond a
588 few dozen. Modestly large Perforce installations require the
589 deployment of proxies to cope with the load their users
590 generate.</para>
591
592
593 </sect2>
594 <sect2>
595 <title>Choosing a revision control tool</title>
596
597 <para id="x_bc">With the exception of CVS, all of the tools listed above
598 have unique strengths that suit them to particular styles of
599 work. There is no single revision control tool that is best
600 in all situations.</para>
601
602 <para id="x_bd">As an example, Subversion is a good choice for working
603 with frequently edited binary files, due to its centralised
604 nature and support for file locking.</para>
605
606 <para id="x_be">I personally find Mercurial's properties of simplicity,
607 performance, and good merge support to be a compelling
608 combination that has served me well for several years.</para>
609
610
611 </sect2>
612 </sect1>
613 <sect1>
614 <title>Switching from another tool to Mercurial</title>
615
616 <para id="x_bf">Mercurial is bundled with an extension named <literal
617 role="hg-ext">convert</literal>, which can incrementally
618 import revision history from several other revision control
619 tools. By <quote>incremental</quote>, I mean that you can
620 convert all of a project's history to date in one go, then rerun
621 the conversion later to obtain new changes that happened after
622 the initial conversion.</para>
623
624 <para id="x_c0">The revision control tools supported by <literal
625 role="hg-ext">convert</literal> are as follows:</para>
626 <itemizedlist>
627 <listitem><para id="x_c1">Subversion</para></listitem>
628 <listitem><para id="x_c2">CVS</para></listitem>
629 <listitem><para id="x_c3">Git</para></listitem>
630 <listitem><para id="x_c4">Darcs</para></listitem></itemizedlist>
631
632 <para id="x_c5">In addition, <literal role="hg-ext">convert</literal> can
633 export changes from Mercurial to Subversion. This makes it
634 possible to try Subversion and Mercurial in parallel before
635 committing to a switchover, without risking the loss of any
636 work.</para>
637
638 <para id="x_c6">The <command role="hg-ext-convert">convert</command> command
639 is easy to use. Simply point it at the path or URL of the
640 source repository, optionally give it the name of the
641 destination repository, and it will start working. After the
642 initial conversion, just run the same command again to import
643 new changes.</para>
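<para>As a rough sketch of that workflow, the shell function below wraps
the conversion command. The function name
<literal>convert_to_hg</literal> and the paths in the usage note are made
up for illustration. The <literal>--config extensions.convert=</literal>
option enables the bundled extension for a single invocation, on the
assumption that you have not already enabled it in your configuration
file.</para>

<programlisting>
```shell
# Hypothetical helper: convert_to_hg SOURCE [DEST]
# SOURCE is the path or URL of the repository to import from.
# DEST is optional; if it is omitted, hg convert chooses a name itself.
convert_to_hg() {
    # Turn on the bundled convert extension just for this run, then
    # pass the source and optional destination straight through.
    hg --config extensions.convert= convert "$@"
}
```
</programlisting>

<para>Running <literal>convert_to_hg /path/to/svn-checkout
project-hg</literal> once performs the initial import; running exactly the
same command again later brings in only the changes made since the
previous run.</para>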
644 </sect1>
645
646 <sect1>
647 <title>A short history of revision control</title>
648
649 <para id="x_c7">The best known of the old-time revision control tools is
650 SCCS (Source Code Control System), which Marc Rochkind wrote at
651 Bell Labs in the early 1970s. SCCS operated on individual
652 files, and required every person working on a project to have
653 access to a shared workspace on a single system. Only one
654 person could modify a file at any time; arbitration for access
655 to files was via locks. It was common for people to lock files,
656 and later forget to unlock them, preventing anyone else from
657 modifying those files without the help of an
658 administrator.</para>
659
660 <para id="x_c8">Walter Tichy developed a free alternative to SCCS in the
661 early 1980s; he called his program RCS (Revision Control System).
662 Like SCCS, RCS required developers to work in a single shared
663 workspace, and to lock files to prevent multiple people from
664 modifying them simultaneously.</para>
665
666 <para id="x_c9">Later in the 1980s, Dick Grune used RCS as a building block
667 for a set of shell scripts he initially called cmt, but then
668 renamed to CVS (Concurrent Versions System). The big innovation
669 of CVS was that it let developers work simultaneously and
670 somewhat independently in their own personal workspaces. The
671 personal workspaces prevented developers from stepping on each
672 other's toes all the time, as was common with SCCS and RCS. Each
673 developer had a copy of every project file, and could modify
674 their copies independently. They had to merge their edits prior
675 to committing changes to the central repository.</para>
676
677 <para id="x_ca">Brian Berliner took Grune's original scripts and rewrote
678 them in C, releasing in 1989 the code that has since developed
679 into the modern version of CVS. CVS subsequently acquired the
680 ability to operate over a network connection, giving it a
681 client/server architecture. CVS's architecture is centralised;
682 only the server has a copy of the history of the project. Client
683 workspaces just contain copies of recent versions of the
684 project's files, and a little metadata to tell them where the
685 server is. CVS has been enormously successful; it is probably
686 the world's most widely used revision control system.</para>
687
688 <para id="x_cb">In the early 1990s, Sun Microsystems developed an early
689 distributed revision control system, called TeamWare. A
690 TeamWare workspace contains a complete copy of the project's
691 history. TeamWare has no notion of a central repository. (CVS
692 relied upon RCS for its history storage; TeamWare used
693 SCCS.)</para>
694
695 <para id="x_cc">As the 1990s progressed, awareness grew of a number of
696 problems with CVS. It records simultaneous changes to multiple
697 files individually, instead of grouping them together as a
698 single logically atomic operation. It does not manage its file
699 hierarchy well; it is easy to make a mess of a repository by
700 renaming files and directories. Worse, its source code is
701 difficult to read and maintain, which makes the <quote>pain
702 level</quote> of fixing these architectural problems
703 prohibitive.</para>
704
705 <para id="x_cd">In 2000, Jim Blandy and Karl Fogel, two developers who had
706 worked on CVS, started a project to replace it with a tool that
707 would have a better architecture and cleaner code. The result,
708 Subversion, does not stray from CVS's centralised client/server
709 model, but it adds multi-file atomic commits, better namespace
710 management, and a number of other features that make it a
711 generally better tool than CVS. Since its initial release, it
712 has rapidly grown in popularity.</para>
713
714 <para id="x_ce">More or less simultaneously, Graydon Hoare began working on
715 an ambitious distributed revision control system that he named
716 Monotone. While Monotone addresses many of CVS's design flaws
717 and has a peer-to-peer architecture, it goes beyond earlier (and
718 subsequent) revision control tools in a number of innovative
719 ways. It uses cryptographic hashes as identifiers, and has an
720 integral notion of <quote>trust</quote> for code from different
721 sources.</para>
722
723 <para id="x_cf">Mercurial began life in 2005. While a few aspects of its
724 design are influenced by Monotone, Mercurial focuses on ease of
725 use, high performance, and scalability to very large
726 projects.</para>
727 </sect1>
728 </chapter>
729
730 <!--
731 local variables:
732 sgml-parent-document: ("00book.xml" "book" "chapter")
733 end:
734 -->