comparison en/ch01-intro.xml @ 652:863a82f13901
Basic progress on XML.
author | Bryan O'Sullivan <bos@serpentine.com> |
---|---|
date | Thu, 05 Feb 2009 22:45:48 -0800 |
parents | en/ch01-intro.tex@f72b7e6cbe90 |
children | 8631da51309b |
651:cf006cabe489 | 652:863a82f13901 |
---|---|
1 <!-- vim: set filetype=docbkxml shiftwidth=2 autoindent expandtab tw=77 : --> | |
2 | |
3 <chapter> | |
4 <title>Introduction</title> | |
5 <para>\label{chap:intro}</para> | |
6 | |
7 <sect1> | |
8 <title>About revision control</title> | |
9 | |
10 <para>Revision control is the process of managing multiple | |
11 versions of a piece of information. In its simplest form, this | |
12 is something that many people do by hand: every time you modify | |
13 a file, save it under a new name that contains a number, each | |
14 one higher than the number of the preceding version.</para> | |
15 | |
16 <para>Manually managing multiple versions of even a single file is | |
17 an error-prone task, though, so software tools to help automate | |
18 this process have long been available. The earliest automated | |
19 revision control tools were intended to help a single user to | |
20 manage revisions of a single file. Over the past few decades, | |
21 the scope of revision control tools has expanded greatly; they | |
22 now manage multiple files, and help multiple people to work | |
23 together. The best modern revision control tools have no | |
24 problem coping with thousands of people working together on | |
25 projects that consist of hundreds of thousands of files.</para> | |
26 | |
27 <sect2> | |
28 <title>Why use revision control?</title> | |
29 | |
30 <para>There are a number of reasons why you or your team might | |
31 want to use an automated revision control tool for a | |
32 project.</para> | |
33 <itemizedlist> | |
34 <listitem><para>It will track the history and evolution of | |
35 your project, so you don't have to. For every change, | |
36 you'll have a log of <emphasis>who</emphasis> made it; | |
37 <emphasis>why</emphasis> they made it; | |
38 <emphasis>when</emphasis> they made it; and | |
39 <emphasis>what</emphasis> the change | |
40 was.</para></listitem> | |
41 <listitem><para>When you're working with other people, | |
42 revision control software makes it easier for you to | |
43 collaborate. For example, when people more or less | |
44 simultaneously make potentially incompatible changes, the | |
45 software will help you to identify and resolve those | |
46 conflicts.</para></listitem> | |
47 <listitem><para>It can help you to recover from mistakes. If | |
48 you make a change that later turns out to be in error, you | |
49 can revert to an earlier version of one or more files. In | |
50 fact, a <emphasis>really</emphasis> good revision control | |
51 tool will even help you to efficiently figure out exactly | |
52 when a problem was introduced (see section <xref | |
53 linkend="sec:undo:bisect"/> for details).</para></listitem> | |
54 <listitem><para>It will help you to work simultaneously on, | |
55 and manage the drift between, multiple versions of your | |
56 project.</para></listitem></itemizedlist> | |
57 <para>Most of these reasons are equally valid, at least in | |
58 theory, whether you're working on a project by yourself, or | |
59 with a hundred other people.</para> | |
60 | |
61 <para>A key question about the practicality of revision control | |
62 at these two different scales (<quote>lone hacker</quote> and | |
63 <quote>huge team</quote>) is how its | |
64 <emphasis>benefits</emphasis> compare to its | |
65 <emphasis>costs</emphasis>. A revision control tool that's | |
66 difficult to understand or use is going to impose a high | |
67 cost.</para> | |
68 | |
69 <para>A five-hundred-person project is likely to collapse under | |
70 its own weight almost immediately without a revision control | |
71 tool and process. In this case, the cost of using revision | |
72 control might hardly seem worth considering, since | |
73 <emphasis>without</emphasis> it, failure is almost | |
74 guaranteed.</para> | |
75 | |
76 <para>On the other hand, a one-person <quote>quick hack</quote> | |
77 might seem like a poor place to use a revision control tool, | |
78 because surely the cost of using one must be close to the | |
79 overall cost of the project. Right?</para> | |
80 | |
81 <para>Mercurial uniquely supports <emphasis>both</emphasis> of | |
82 these scales of development. You can learn the basics in just | |
83 a few minutes, and due to its low overhead, you can apply | |
84 revision control to the smallest of projects with ease. Its | |
85 simplicity means you won't have a lot of abstruse concepts or | |
86 command sequences competing for mental space with whatever | |
87 you're <emphasis>really</emphasis> trying to do. At the same | |
88 time, Mercurial's high performance and peer-to-peer nature let | |
89 you scale painlessly to handle large projects.</para> | |
90 | |
91 <para>No revision control tool can rescue a poorly run project, | |
92 but a good choice of tools can make a huge difference to the | |
93 fluidity with which you can work on a project.</para> | |
94 | |
95 </sect2> | |
96 <sect2> | |
97 <title>The many names of revision control</title> | |
98 | |
99 <para>Revision control is a diverse field, so much so that it | |
100 doesn't actually have a single name or acronym. Here are a | |
101 few of the more common names and acronyms you'll | |
102 encounter:</para> | |
103 <itemizedlist> | |
104 <listitem><para>Revision control (RCS)</para></listitem> | |
105 <listitem><para>Software configuration management (SCM), or | |
106 configuration management</para></listitem> | |
107 <listitem><para>Source code management</para></listitem> | |
108 <listitem><para>Source code control, or source | |
109 control</para></listitem> | |
110 <listitem><para>Version control | |
111 (VCS)</para></listitem></itemizedlist> | |
112 <para>Some people claim that these terms actually have different | |
113 meanings, but in practice they overlap so much that there's no | |
114 agreed or even useful way to tease them apart.</para> | |
115 | |
116 </sect2> | |
117 </sect1> | |
118 <sect1> | |
119 <title>A short history of revision control</title> | |
120 | |
121 <para>The best known of the old-time revision control tools is | |
122 SCCS (Source Code Control System), which Marc Rochkind wrote at | |
123 Bell Labs, in the early 1970s. SCCS operated on individual | |
124 files, and required every person working on a project to have | |
125 access to a shared workspace on a single system. Only one | |
126 person could modify a file at any time; arbitration for access | |
127 to files was via locks. It was common for people to lock files, | |
128 and later forget to unlock them, preventing anyone else from | |
129 modifying those files without the help of an | |
130 administrator.</para> | |
131 | |
132 <para>Walter Tichy developed a free alternative to SCCS in the | |
133 early 1980s; he called his program RCS (Revision Control System). | |
134 Like SCCS, RCS required developers to work in a single shared | |
135 workspace, and to lock files to prevent multiple people from | |
136 modifying them simultaneously.</para> | |
137 | |
138 <para>Later in the 1980s, Dick Grune used RCS as a building block | |
139 for a set of shell scripts he initially called cmt, but then | |
140 renamed to CVS (Concurrent Versions System). The big innovation | |
141 of CVS was that it let developers work simultaneously and | |
142 somewhat independently in their own personal workspaces. The | |
143 personal workspaces prevented developers from stepping on each | |
144 other's toes all the time, as was common with SCCS and RCS. Each | |
145 developer had a copy of every project file, and could modify | |
146 their copies independently. They had to merge their edits prior | |
147 to committing changes to the central repository.</para> | |
148 | |
149 <para>Brian Berliner took Grune's original scripts and rewrote | |
150 them in C, releasing in 1989 the code that has since developed | |
151 into the modern version of CVS. CVS subsequently acquired the | |
152 ability to operate over a network connection, giving it a | |
153 client/server architecture. CVS's architecture is centralised; | |
154 only the server has a copy of the history of the project. Client | |
155 workspaces just contain copies of recent versions of the | |
156 project's files, and a little metadata to tell them where the | |
157 server is. CVS has been enormously successful; it is probably | |
158 the world's most widely used revision control system.</para> | |
159 | |
160 <para>In the early 1990s, Sun Microsystems developed an early | |
161 distributed revision control system, called TeamWare. A | |
162 TeamWare workspace contains a complete copy of the project's | |
163 history. TeamWare has no notion of a central repository. (CVS | |
164 relied upon RCS for its history storage; TeamWare used | |
165 SCCS.)</para> | |
166 | |
167 <para>As the 1990s progressed, awareness grew of a number of | |
168 problems with CVS. It records simultaneous changes to multiple | |
169 files individually, instead of grouping them together as a | |
170 single logically atomic operation. It does not manage its file | |
171 hierarchy well; it is easy to make a mess of a repository by | |
172 renaming files and directories. Worse, its source code is | |
173 difficult to read and maintain, which made the <quote>pain | |
174 level</quote> of fixing these architectural problems | |
175 prohibitive.</para> | |
176 | |
177 <para>In 2001, Jim Blandy and Karl Fogel, two developers who had | |
178 worked on CVS, started a project to replace it with a tool that | |
179 would have a better architecture and cleaner code. The result, | |
180 Subversion, does not stray from CVS's centralised client/server | |
181 model, but it adds multi-file atomic commits, better namespace | |
182 management, and a number of other features that make it a | |
183 generally better tool than CVS. Since its initial release, it | |
184 has rapidly grown in popularity.</para> | |
185 | |
186 <para>More or less simultaneously, Graydon Hoare began working on | |
187 an ambitious distributed revision control system that he named | |
188 Monotone. While Monotone addresses many of CVS's design flaws | |
189 and has a peer-to-peer architecture, it goes beyond earlier (and | |
190 subsequent) revision control tools in a number of innovative | |
191 ways. It uses cryptographic hashes as identifiers, and has an | |
192 integral notion of <quote>trust</quote> for code from different | |
193 sources.</para> | |
194 | |
195 <para>Mercurial began life in 2005. While a few aspects of its | |
196 design are influenced by Monotone, Mercurial focuses on ease of | |
197 use, high performance, and scalability to very large | |
198 projects.</para> | |
199 | |
200 </sect1> | |
201 <sect1> | |
202 <title>Trends in revision control</title> | |
203 | |
204 <para>There has been an unmistakable trend in the development and | |
205 use of revision control tools over the past four decades, as | |
206 people have become familiar with the capabilities of their tools | |
207 and constrained by their limitations.</para> | |
208 | |
209 <para>The first generation began by managing single files on | |
210 individual computers. Although these tools represented a huge | |
211 advance over ad-hoc manual revision control, their locking model | |
212 and reliance on a single computer limited them to small, | |
213 tightly-knit teams.</para> | |
214 | |
215 <para>The second generation loosened these constraints by moving | |
216 to network-centered architectures, and managing entire projects | |
217 at a time. As projects grew larger, they ran into new problems. | |
218 With clients needing to talk to servers very frequently, server | |
219 scaling became an issue for large projects. An unreliable | |
220 network connection could prevent remote users from being able to | |
221 talk to the server at all. As open source projects started | |
222 making read-only access available anonymously to anyone, people | |
223 without commit privileges found that they could not use the | |
224 tools to interact with a project in a natural way, as they could | |
225 not record their changes.</para> | |
226 | |
227 <para>The current generation of revision control tools is | |
228 peer-to-peer in nature. All of these systems have dropped the | |
229 dependency on a single central server, and allow people to | |
230 distribute their revision control data to where it's actually | |
231 needed. Collaboration over the Internet has moved from being | |
232 constrained by technology to a matter of choice and consensus. | |
233 Modern tools can operate offline indefinitely and autonomously, | |
234 with a network connection only needed when syncing changes with | |
235 another repository.</para> | |
236 | |
237 </sect1> | |
238 <sect1> | |
239 <title>A few of the advantages of distributed revision | |
240 control</title> | |
241 | |
242 <para>Even though distributed revision control tools have for | |
243 several years been as robust and usable as their | |
244 previous-generation counterparts, people using older tools have | |
245 not yet necessarily woken up to their advantages. There are a | |
246 number of ways in which distributed tools shine relative to | |
247 centralised ones.</para> | |
248 | |
249 <para>For an individual developer, distributed tools are almost | |
250 always much faster than centralised tools. This is for a simple | |
251 reason: a centralised tool needs to talk over the network for | |
252 many common operations, because most metadata is stored in a | |
253 single copy on the central server. A distributed tool stores | |
254 all of its metadata locally. All else being equal, talking over | |
255 the network adds overhead to a centralised tool. Don't | |
256 underestimate the value of a snappy, responsive tool: you're | |
257 going to spend a lot of time interacting with your revision | |
258 control software.</para> | |
259 | |
260 <para>Distributed tools are indifferent to the vagaries of your | |
261 server infrastructure, again because they replicate metadata to | |
262 so many locations. If you use a centralised system and your | |
263 server catches fire, you'd better hope that your backup media | |
264 are reliable, and that your last backup was recent and actually | |
265 worked. With a distributed tool, you have many backups | |
266 available on every contributor's computer.</para> | |
267 | |
268 <para>The reliability of your network will affect distributed | |
269 tools far less than it will centralised tools. You can't even | |
270 use a centralised tool without a network connection, except for | |
271 a few highly constrained commands. With a distributed tool, if | |
272 your network connection goes down while you're working, you may | |
273 not even notice. The only thing you won't be able to do is talk | |
274 to repositories on other computers, something that is relatively | |
275 rare compared with local operations. If you have a far-flung | |
276 team of collaborators, this may be significant.</para> | |
277 | |
278 <sect2> | |
279 <title>Advantages for open source projects</title> | |
280 | |
281 <para>If you take a shine to an open source project and decide | |
282 that you would like to start hacking on it, and that project | |
283 uses a distributed revision control tool, you are at once a | |
284 peer with the people who consider themselves the | |
285 <quote>core</quote> of that project. If they publish their | |
286 repositories, you can immediately copy their project history, | |
287 start making changes, and record your work, using the same | |
288 tools in the same ways as insiders. By contrast, with a | |
289 centralised tool, you must use the software in a <quote>read | |
290 only</quote> mode unless someone grants you permission to | |
291 commit changes to their central server. Until then, you won't | |
292 be able to record changes, and your local modifications will | |
293 be at risk of corruption any time you try to update your | |
294 client's view of the repository.</para> | |
295 | |
296 <sect3> | |
297 <title>The forking non-problem</title> | |
298 | |
299 <para>It has been suggested that distributed revision control | |
300 tools pose some sort of risk to open source projects because | |
301 they make it easy to <quote>fork</quote> the development of | |
302 a project. A fork happens when there are differences in | |
303 opinion or attitude between groups of developers that cause | |
304 them to decide that they can't work together any longer. | |
305 Each side takes a more or less complete copy of the | |
306 project's source code, and goes off in its own | |
307 direction.</para> | |
308 | |
309 <para>Sometimes the camps in a fork decide to reconcile their | |
310 differences. With a centralised revision control system, the | |
311 <emphasis>technical</emphasis> process of reconciliation is | |
312 painful, and has to be performed largely by hand. You have | |
313 to decide whose revision history is going to | |
314 <quote>win</quote>, and graft the other team's changes into | |
315 the tree somehow. This usually loses some or all of one | |
316 side's revision history.</para> | |
317 | |
318 <para>With respect to forking, distributed tools | |
319 make forking the <emphasis>only</emphasis> way to | |
320 develop a project. Every single change that you make is | |
321 potentially a fork point. The great strength of this | |
322 approach is that a distributed revision control tool has to | |
323 be really good at <emphasis>merging</emphasis> forks, | |
324 because forks are absolutely fundamental: they happen all | |
325 the time.</para> | |
326 | |
327 <para>If every piece of work that everybody does, all the | |
328 time, is framed in terms of forking and merging, then what | |
329 the open source world refers to as a <quote>fork</quote> | |
330 becomes <emphasis>purely</emphasis> a social issue. If | |
331 anything, distributed tools <emphasis>lower</emphasis> the | |
332 likelihood of a fork:</para> | |
333 <itemizedlist> | |
334 <listitem><para>They eliminate the social distinction that | |
335 centralised tools impose: that between insiders (people | |
336 with commit access) and outsiders (people | |
337 without).</para></listitem> | |
338 <listitem><para>They make it easier to reconcile after a | |
339 social fork, because all that's involved from the | |
340 perspective of the revision control software is just | |
341 another merge.</para></listitem></itemizedlist> | |
342 | |
343 <para>Some people resist distributed tools because they want | |
344 to retain tight control over their projects, and they | |
345 believe that centralised tools give them this control. | |
346 However, if you're of this belief, and you publish your CVS | |
347 or Subversion repositories publicly, there are plenty of | |
348 tools available that can pull out your entire project's | |
349 history (albeit slowly) and recreate it somewhere that you | |
350 don't control. So while your control in this case is | |
351 illusory, you are forgoing the ability to fluidly | |
352 collaborate with whatever people feel compelled to mirror | |
353 and fork your history.</para> | |
354 | |
355 </sect3> | |
356 </sect2> | |
357 <sect2> | |
358 <title>Advantages for commercial projects</title> | |
359 | |
360 <para>Many commercial projects are undertaken by teams that are | |
361 scattered across the globe. Contributors who are far from a | |
362 central server will see slower command execution and perhaps | |
363 less reliability. Commercial revision control systems attempt | |
364 to ameliorate these problems with remote-site replication | |
365 add-ons that are typically expensive to buy and cantankerous | |
366 to administer. A distributed system doesn't suffer from these | |
367 problems in the first place. Better yet, you can easily set | |
368 up multiple authoritative servers, say one per site, so that | |
369 there's no redundant communication between repositories over | |
370 expensive long-haul network links.</para> | |
371 | |
372 <para>Centralised revision control systems tend to have | |
373 relatively low scalability. It's not unusual for an expensive | |
374 centralised system to fall over under the combined load of | |
375 just a few dozen concurrent users. Once again, the typical | |
376 response tends to be an expensive and clunky replication | |
377 facility. Since the load on a central server, if you have | |
378 one at all, is many times lower with a distributed tool | |
379 (because all of the data is replicated everywhere), a single | |
380 cheap server can handle the needs of a much larger team, and | |
381 replication to balance load becomes a simple matter of | |
382 scripting.</para> | |
383 | |
384 <para>If you have an employee in the field, troubleshooting a | |
385 problem at a customer's site, they'll benefit from distributed | |
386 revision control. The tool will let them generate custom | |
387 builds, try different fixes in isolation from each other, and | |
388 search efficiently through history for the sources of bugs and | |
389 regressions in the customer's environment, all without needing | |
390 to connect to your company's network.</para> | |
391 | |
392 </sect2> | |
393 </sect1> | |
394 <sect1> | |
395 <title>Why choose Mercurial?</title> | |
396 | |
397 <para>Mercurial has a unique set of properties that make it a | |
398 particularly good choice as a revision control system.</para> | |
399 <itemizedlist> | |
400 <listitem><para>It is easy to learn and use.</para></listitem> | |
401 <listitem><para>It is lightweight.</para></listitem> | |
402 <listitem><para>It scales excellently.</para></listitem> | |
403 <listitem><para>It is easy to | |
404 customise.</para></listitem></itemizedlist> | |
405 | |
406 <para>If you are at all familiar with revision control systems, | |
407 you should be able to get up and running with Mercurial in less | |
408 than five minutes. Even if not, it will take no more than a few | |
409 minutes longer. Mercurial's command and feature sets are | |
410 generally uniform and consistent, so you can keep track of a few | |
411 general rules instead of a host of exceptions.</para> | |
412 | |
413 <para>On a small project, you can start working with Mercurial in | |
414 moments. Creating new changes and branches; transferring changes | |
415 around (whether locally or over a network); and history and | |
416 status operations are all fast. Mercurial attempts to stay | |
417 nimble and largely out of your way by combining low cognitive | |
418 overhead with blazingly fast operations.</para> | |
419 | |
420 <para>The usefulness of Mercurial is not limited to small | |
421 projects: it is used by projects with hundreds to thousands of | |
422 contributors, each containing tens of thousands of files and | |
423 hundreds of megabytes of source code.</para> | |
424 | |
425 <para>If the core functionality of Mercurial is not enough for | |
426 you, it's easy to build on. Mercurial is well suited to | |
427 scripting tasks, and its clean internals and implementation in | |
428 Python make it easy to add features in the form of extensions. | |
429 There are a number of popular and useful extensions already | |
430 available, ranging from helping to identify bugs to improving | |
431 performance.</para> | |
432 | |
433 </sect1> | |
434 <sect1> | |
435 <title>Mercurial compared with other tools</title> | |
436 | |
437 <para>Before you read on, please understand that this section | |
438 necessarily reflects my own experiences, interests, and (dare I | |
439 say it) biases. I have used every one of the revision control | |
440 tools listed below, in most cases for several years at a | |
441 time.</para> | |
442 | |
443 | |
444 <sect2> | |
445 <title>Subversion</title> | |
446 | |
447 <para>Subversion is a popular revision control tool, developed | |
448 to replace CVS. It has a centralised client/server | |
449 architecture.</para> | |
450 | |
451 <para>Subversion and Mercurial have similarly named commands for | |
452 performing the same operations, so if you're familiar with | |
453 one, it is easy to learn to use the other. Both tools are | |
454 portable to all popular operating systems.</para> | |
455 | |
456 <para>Prior to version 1.5, Subversion had no useful support for | |
457 merges. At the time of writing, its merge tracking capability | |
458 is new, and known to be <ulink | |
459 url="http://svnbook.red-bean.com/nightly/en/svn.branchmerge.advanced.html#svn.branchmerge.advanced.finalword">complicated | |
460 and buggy</ulink>.</para> | |
461 | |
462 <para>Mercurial has a substantial performance advantage over | |
463 Subversion on every revision control operation I have | |
464 benchmarked. I have measured its advantage as ranging from a | |
465 factor of two to a factor of six when compared with Subversion | |
466 1.4.3's <emphasis>ra_local</emphasis> file store, which is the | |
467 fastest access method available. In more realistic | |
468 deployments involving a network-based store, Subversion will | |
469 be at a substantially larger disadvantage. Because many | |
470 Subversion commands must talk to the server and Subversion | |
471 does not have useful replication facilities, server capacity | |
472 and network bandwidth become bottlenecks for modestly large | |
473 projects.</para> | |
474 | |
475 <para>Additionally, Subversion incurs substantial storage | |
476 overhead to avoid network transactions for a few common | |
477 operations, such as finding modified files | |
478 (<literal>status</literal>) and displaying modifications | |
479 against the current revision (<literal>diff</literal>). As a | |
480 result, a Subversion working copy is often the same size as, | |
481 or larger than, a Mercurial repository and working directory, | |
482 even though the Mercurial repository contains a complete | |
483 history of the project.</para> | |
484 | |
485 <para>Subversion is widely supported by third party tools. | |
486 Mercurial currently lags considerably in this area. This gap | |
487 is closing, however, and indeed some of Mercurial's GUI tools | |
488 now outshine their Subversion equivalents. Like Mercurial, | |
489 Subversion has an excellent user manual.</para> | |
490 | |
491 <para>Because Subversion doesn't store revision history on the | |
492 client, it is well suited to managing projects that deal with | |
493 lots of large, opaque binary files. If you check in fifty | |
494 revisions to an incompressible 10MB file, Subversion's | |
495 client-side space usage stays constant. The space used by any | |
496 distributed SCM will grow rapidly in proportion to the number | |
497 of revisions, because the differences between each revision | |
498 are large.</para> | |
499 | |
500 <para>In addition, it's often difficult or, more usually, | |
501 impossible to merge different versions of a binary file. | |
502 Subversion's ability to let a user lock a file, so that they | |
503 temporarily have the exclusive right to commit changes to it, | |
504 can be a significant advantage to a project where binary files | |
505 are widely used.</para> | |
506 | |
507 <para>Mercurial can import revision history from a Subversion | |
508 repository. It can also export revision history to a | |
509 Subversion repository. This makes it easy to <quote>test the | |
510 waters</quote> and use Mercurial and Subversion in parallel | |
511 before deciding to switch. History conversion is incremental, | |
512 so you can perform an initial conversion, then small | |
513 additional conversions afterwards to bring in new | |
514 changes.</para> | |
515 | |
516 | |
517 </sect2> | |
518 <sect2> | |
519 <title>Git</title> | |
520 | |
521 <para>Git is a distributed revision control tool that was | |
522 developed for managing the Linux kernel source tree. Like | |
523 Mercurial, its early design was somewhat influenced by | |
524 Monotone.</para> | |
525 | |
526 <para>Git has a very large command set, with version 1.5.0 | |
527 providing 139 individual commands. It has something of a | |
528 reputation for being difficult to learn. Compared to Git, | |
529 Mercurial has a strong focus on simplicity.</para> | |
530 | |
531 <para>In terms of performance, Git is extremely fast. In | |
532 several cases, it is faster than Mercurial, at least on Linux, | |
533 while Mercurial performs better on other operations. However, | |
534 on Windows, the performance and general level of support that | |
535 Git provides are, at the time of writing, far behind those of | |
536 Mercurial.</para> | |
537 | |
538 <para>While a Mercurial repository needs no maintenance, a Git | |
539 repository requires frequent manual <quote>repacks</quote> of | |
540 its metadata. Without these, performance degrades, while | |
541 space usage grows rapidly. A server that contains many Git | |
542 repositories that are not rigorously and frequently repacked | |
543 will become heavily disk-bound during backups, and there have | |
544 been instances of daily backups taking far longer than 24 | |
545 hours as a result. A freshly packed Git repository is | |
546 slightly smaller than a Mercurial repository, but an unpacked | |
547 repository is several orders of magnitude larger.</para> | |
548 | |
549 <para>The core of Git is written in C. Many Git commands are | |
550 implemented as shell or Perl scripts, and the quality of these | |
551 scripts varies widely. I have encountered several instances | |
552 where scripts charged along blindly in the presence of errors | |
553 that should have been fatal.</para> | |
554 | |
555 <para>Mercurial can import revision history from a Git | |
556 repository.</para> | |
557 | |
558 | |
559 </sect2> | |
560 <sect2> | |
561 <title>CVS</title> | |
562 | |
563 <para>CVS is probably the most widely used revision control tool | |
564 in the world. Due to its age and internal untidiness, it has | |
565 been only lightly maintained for many years.</para> | |
566 | |
567 <para>It has a centralised client/server architecture. It does | |
568 not group related file changes into atomic commits, making it | |
569 easy for people to <quote>break the build</quote>: one person | |
570 can successfully commit part of a change and then be blocked | |
571 by the need for a merge, causing other people to see only a | |
572 portion of the work they intended to do. This also affects | |
573 how you work with project history. If you want to see all of | |
574 the modifications someone made as part of a task, you will | |
575 need to manually inspect the descriptions and timestamps of | |
576 the changes made to each file involved (if you even know what | |
577 those files were).</para> | |
578 | |
579 <para>CVS has a muddled notion of tags and branches that I will | |
580 not attempt to even describe. It does not support renaming of | |
581 files or directories well, making it easy to corrupt a | |
582 repository. It has almost no internal consistency checking | |
583 capabilities, so it is usually not even possible to tell | |
584 whether or how a repository is corrupt. I would not recommend | |
585 CVS for any project, existing or new.</para> | |
586 | |
587 <para>Mercurial can import CVS revision history. However, there | |
588 are a few caveats that apply; these are true of every other | |
589 revision control tool's CVS importer, too. Due to CVS's lack | |
590 of atomic changes and unversioned filesystem hierarchy, it is | |
591 not possible to reconstruct CVS history completely accurately; | |
592 some guesswork is involved, and renames will usually not show | |
593 up. Because a lot of advanced CVS administration has to be | |
594 done by hand and is hence error-prone, it's common for CVS | |
595 importers to run into multiple problems with corrupted | |
596 repositories (completely bogus revision timestamps and files | |
597 that have remained locked for over a decade are just two of | |
598 the less interesting problems I can recall from personal | |
599 experience).</para> | |
600 | |
601 <para>Mercurial can import revision history from a CVS | |
602 repository.</para> | |
603 | |
604 | |
605 </sect2> | |
606 <sect2> | |
607 <title>Commercial tools</title> | |
608 | |
609 <para>Perforce has a centralised client/server architecture, | |
610 with no client-side caching of any data. Unlike modern | |
611 revision control tools, Perforce requires that a user run a | |
612 command to inform the server about every file they intend to | |
613 edit.</para> | |
614 | |
615 <para>The performance of Perforce is quite good for small teams, | |
616 but it falls off rapidly as the number of users grows beyond a | |
617 few dozen. Modestly large Perforce installations require the | |
618 deployment of proxies to cope with the load their users | |
619 generate.</para> | |
620 | |
621 | |
622 </sect2> | |
623 <sect2> | |
624 <title>Choosing a revision control tool</title> | |
625 | |
626 <para>With the exception of CVS, all of the tools listed above | |
627 have unique strengths that suit them to particular styles of | |
628 work. There is no single revision control tool that is best | |
629 in all situations.</para> | |
630 | |
631 <para>As an example, Subversion is a good choice for working | |
632 with frequently edited binary files, due to its centralised | |
633 nature and support for file locking.</para> | |
634 | |
635 <para>I personally find Mercurial's properties of simplicity, | |
636 performance, and good merge support to be a compelling | |
637 combination that has served me well for several years.</para> | |
638 | |
639 | |
640 </sect2> | |
641 </sect1> | |
642 <sect1> | |
643 <title>Switching from another tool to Mercurial</title> | |
644 | |
645 <para>Mercurial is bundled with an extension named <literal | |
646 role="hg-ext">convert</literal>, which can incrementally | |
647 import revision history from several other revision control | |
648 tools. By <quote>incremental</quote>, I mean that you can | |
649 convert all of a project's history to date in one go, then rerun | |
650 the conversion later to obtain new changes that happened after | |
651 the initial conversion.</para> | |
652 | |
653 <para>The revision control tools supported by <literal | |
654 role="hg-ext">convert</literal> are as follows:</para> | |
655 <itemizedlist> | |
656 <listitem><para>Subversion</para></listitem> | |
657 <listitem><para>CVS</para></listitem> | |
658 <listitem><para>Git</para></listitem> | |
659 <listitem><para>Darcs</para></listitem></itemizedlist> | |
660 | |
661 <para>In addition, <literal role="hg-ext">convert</literal> can | |
662 export changes from Mercurial to Subversion. This makes it | |
663 possible to try Subversion and Mercurial in parallel before | |
664 committing to a switchover, without risking the loss of any | |
665 work.</para> | |
666 | |
667 <para>The <command role="hg-ext-conver">convert</command> command | |
668 is easy to use. Simply point it at the path or URL of the | |
669 source repository, optionally give it the name of the | |
670 destination repository, and it will start working. After the | |
671 initial conversion, just run the same command again to import | |
672 new changes.</para> | |
673 </sect1> | |
674 </chapter> | |
675 | |
676 <!-- | |
677 local variables: | |
678 sgml-parent-document: ("00book.xml" "book" "chapter") | |
679 end: | |
680 --> |