comparison en/intro.tex @ 219:15a6fd2ba582

Start talking about the advantages of distributed tools.
author Bryan O'Sullivan <bos@serpentine.com>
date Mon, 14 May 2007 11:20:34 -0700
parents 75fd236d736b
children 0ca9045035f7
218:75fd236d736b 219:15a6fd2ba582
1 \chapter{Introduction} 1 \chapter{Introduction}
2 \label{chap:intro} 2 \label{chap:intro}
3 3
4 \section{About revision control} 4 \section{About revision control}
5 5
6 Revision control is the management of multiple versions of a piece of 6 Revision control is the process of managing multiple versions of a
7 information. In its simplest form, it's a process that many people 7 piece of information. In its simplest form, this is something that
8 perform by hand: every time you modify a file, save it under a new 8 many people do by hand: every time you modify a file, save it under a
9 name that contains a number, each one higher than the number of the 9 new name that contains a number, each one higher than the number of
10 preceding version. 10 the preceding version.
11 11
12 Manually managing multiple versions of even a single file is an 12 Manually managing multiple versions of even a single file is an
13 error-prone task, though, so software tools to help automate this 13 error-prone task, though, so software tools to help automate this
14 process have long been available. The earliest automated revision 14 process have long been available. The earliest automated revision
15 control tools were intended to help a single user to manage revisions 15 control tools were intended to help a single user to manage revisions
16 to a single file. Over the past several decades, the scope of 16 of a single file. Over the past few decades, the scope of revision
17 revision control tools has expanded greatly; they now manage multiple 17 control tools has expanded greatly; they now manage multiple files,
18 files, and help multiple people to work together. The best modern 18 and help multiple people to work together. The best modern revision
19 revision control tools will have no problem coping with thousands of 19 control tools have no problem coping with thousands of people working
20 people working together on a single project, which might consist of 20 together on projects that consist of hundreds of thousands of files.
21 hundreds of thousands of files.
22 21
23 \subsection{Why use revision control?} 22 \subsection{Why use revision control?}
24 23
25 There are a number of reasons why you or your team might want to use 24 There are a number of reasons why you or your team might want to use
26 an automated revision control tool for a project. 25 an automated revision control tool for a project.
27 \begin{itemize} 26 \begin{itemize}
28 \item The software gives you a unified way of working with your 27 \item It will track the history and evolution of your project, so you
29 project's files. 28 don't have to. For every change, you'll have a log of \emph{who}
30 \item When you're working with other people, it makes it easier for 29 made it; \emph{why} they made it; \emph{when} they made it; and
31 you to collaborate. For example, when people more or less 30 \emph{what} the change was.
32 simultaneously make potentially incompatible changes, the software 31 \item When you're working with other people, revision control software
33 will help you to identify and resolve those conflicts. 32 makes it easier for you to collaborate. For example, when people
34 \item It will track the history of your project. For every change, 33 more or less simultaneously make potentially incompatible changes,
35 you'll have a log of \emph{who} made it; \emph{why} they made it; 34 the software will help you to identify and resolve those conflicts.
36 \emph{when} they made it; and \emph{what} the change was.
37 \item It can help you to recover from mistakes. If you make a change 35 \item It can help you to recover from mistakes. If you make a change
38 that later turns out to be in error, you can revert to an earlier 36 that later turns out to be in error, you can revert to an earlier
39 version of one or more files. In fact, a \emph{really} good 37 version of one or more files. In fact, a \emph{really} good
40 revision control tool will even help you to efficiently figure out 38 revision control tool will even help you to efficiently figure out
41 exactly when a problem was introduced (see 39 exactly when a problem was introduced (see
50 A key question about the practicality of revision control at these two 48 A key question about the practicality of revision control at these two
51 different scales (``lone hacker'' and ``huge team'') is how its 49 different scales (``lone hacker'' and ``huge team'') is how its
52 \emph{benefits} compare to its \emph{costs}. A revision control tool 50 \emph{benefits} compare to its \emph{costs}. A revision control tool
53 that's difficult to understand or use is going to impose a high cost. 51 that's difficult to understand or use is going to impose a high cost.
54 52
55 For example, a five-hundred-person project is likely to collapse under 53 A five-hundred-person project is likely to collapse under its own
56 its own weight almost immediately without a revision control tool and 54 weight almost immediately without a revision control tool and process.
57 process. In this case, the cost of using revision control might 55 In this case, the cost of using revision control might hardly seem
58 hardly seem worth considering, since \emph{without} it, failure is 56 worth considering, since \emph{without} it, failure is almost
59 almost guaranteed. 57 guaranteed.
60 58
61 On the other hand, a one-person ``quick hack'' might seem like a poor 59 On the other hand, a one-person ``quick hack'' might seem like a poor
62 place to use a revision control tool, because surely the cost of using 60 place to use a revision control tool, because surely the cost of using
63 one must be close to the overall cost of the project. Right? 61 one must be close to the overall cost of the project. Right?
64 62
69 abstruse concepts or command sequences competing for mental space with 67 abstruse concepts or command sequences competing for mental space with
70 whatever you're \emph{really} trying to do. At the same time, 68 whatever you're \emph{really} trying to do. At the same time,
71 Mercurial's high performance and peer-to-peer nature let you scale 69 Mercurial's high performance and peer-to-peer nature let you scale
72 painlessly to handle large projects. 70 painlessly to handle large projects.
73 71
72 No revision control tool can rescue a poorly run project, but a good
73 choice of tools can make a huge difference to the fluidity with which
74 you can work on a project.
75
74 \subsection{The many names of revision control} 76 \subsection{The many names of revision control}
75 77
76 Revision control is a diverse field, so much so that it doesn't 78 Revision control is a diverse field, so much so that it doesn't
77 actually have a single name or acronym. Here are a few of the more 79 actually have a single name or acronym. Here are a few of the more
78 common names and acronyms you'll encounter: 80 common names and acronyms you'll encounter:
79 \begin{itemize} 81 \begin{itemize}
80 \item Configuration management (CM)
81 \item Revision control (RCS) 82 \item Revision control (RCS)
82 \item Software configuration management (SCM) 83 \item Software configuration management (SCM), or configuration management
83 \item Source code management 84 \item Source code management
84 \item Source control 85 \item Source code control, or source control
85 \item Version control (VCS) 86 \item Version control (VCS)
86 \end{itemize} 87 \end{itemize}
87 Some people claim that these terms actually have different meanings, 88 Some people claim that these terms actually have different meanings,
88 but in practice they overlap so much that there's no agreed or even 89 but in practice they overlap so much that there's no agreed or even
89 useful way to tease them apart. 90 useful way to tease them apart.
90 91
91 \section{A short history and hierarchy of revision control} 92 \section{A short history of revision control}
92 93
93 The best known of the old-time revision control tools is SCCS (Source 94 The best known of the old-time revision control tools is SCCS (Source
94 Code Control System), which Marc Rochkind wrote at Bell Labs, in the 95 Code Control System), which Marc Rochkind wrote at Bell Labs, in the
95 early 1970s. SCCS operated on individual files, and required every 96 early 1970s. SCCS operated on individual files, and required every
96 person working on a project to have access to a shared workspace on a 97 person working on a project to have access to a shared workspace on a
157 158
158 Mercurial began life in 2005. While a few aspects of its design are 159 Mercurial began life in 2005. While a few aspects of its design are
159 influenced by Monotone, Mercurial focuses on ease of use, high 160 influenced by Monotone, Mercurial focuses on ease of use, high
160 performance, and scalability to very large projects. 161 performance, and scalability to very large projects.
161 162
162 \subsection{On a single system} 163 \section{Trends in revision control}
163 164
164 \subsection{Network-based, but centralised} 165 There has been an unmistakable trend in the development and use of
165 166 revision control tools over the past four decades, as people have
166 \subsection{Fully distributed} 167 become familiar with the capabilities of their tools and constrained
167 168 by their limitations.
168 169
169 \section{Advantages of distributed revision control} 170 The first generation began by managing single files on individual
171 computers. Although these tools represented a huge advance over
172 ad-hoc manual revision control, their locking model and reliance on a
173 single computer limited them to small, tightly knit teams.
174
175 The second generation loosened these constraints by moving to
176 network-centered architectures, and managing entire projects at a
177 time. As projects grew larger, they ran into new problems. With
178 clients needing to talk to servers very frequently, server scaling
179 became an issue for large projects. An unreliable network connection
180 could prevent remote users from being able to talk to the server at
181 all. As open source projects started making read-only access
182 available anonymously to anyone, people without commit privileges
183 found that they could not use the tools to interact with a project in
184 a natural way, as they could not record their changes.
185
186 The current generation of revision control tools is peer-to-peer in
187 nature. All of these systems have dropped the dependency on a single
188 central server, and allow people to distribute their revision control
189 data to where it's actually needed. Collaboration over the Internet
190 has moved from being constrained by technology to a matter of choice and
191 consensus. Modern tools can operate offline indefinitely and
192 autonomously, with a network connection only needed when syncing
193 changes with another repository.
194
195 \section{A few of the advantages of distributed revision control}
196
197 Even though distributed revision control tools have for several years
198 been as robust and usable as their previous-generation counterparts,
199 people using older tools have not necessarily woken up yet to the
200 advantages of the newer ones. There are a number of ways in which distributed tools
201 shine relative to centralised ones.
202
203 For an individual developer, distributed tools are almost always much
204 faster than centralised tools. This is for a simple reason: a
205 centralised tool needs to talk over the network for many common
206 operations, because most metadata is stored in a single copy on the
207 central server. A distributed tool stores all of its metadata
208 locally. All else being equal, talking over the network adds overhead
209 to a centralised tool. Don't underestimate the value of a snappy,
210 responsive tool: you're going to spend a lot of time interacting with
211 your revision control software.
212
213 Distributed tools are indifferent to the vagaries of your server
214 infrastructure, again because they replicate metadata to so many
215 locations. If you use a centralised system and your server catches
216 fire, you'd better hope that your backup media are reliable, and that
217 your last backup was recent and actually worked. With a distributed
218 tool, you have many backups available on every contributor's computer.
219
220 The reliability of your network will affect distributed tools far less
221 than it will centralised tools. You can't even use a centralised tool
222 without a network connection, except for a few highly constrained
223 commands. With a distributed tool, if your network connection goes
224 down while you're working, you may not even notice. The only thing
225 you won't be able to do is talk to repositories on other computers,
226 something that is relatively rare compared with local operations. If
227 you have a far-flung team of collaborators, this may be significant.
228
229 If you take a shine to an open source project and decide that you
230 would like to start hacking on it, and that project uses a distributed
231 revision control tool, you are at once a peer with the people who
232 consider themselves the ``core'' of that project. If they publish
233 their repositories, you can immediately copy their project history,
234 start making changes, and record your work, using the same tools in
235 the same ways as insiders. By contrast, with a centralised tool, you
236 must use the software in a ``read only'' mode unless someone grants
237 you permission to commit changes to their central server. Until then,
238 you won't be able to record changes, and your local modifications will
239 be at risk of corruption any time you try to update your client's view
240 of the repository.
170 241
171 \subsection{For open source projects} 242 \subsection{For open source projects}
172 243
173 \subsection{For commercial projects} 244 \subsection{For commercial projects}
174 245
175 \subsection{Myths about distributed revision control} 246 \subsection{Myths about distributed revision control}
247
248 \subsubsection{Distributed tools encourage projects to fork}
176 249
177 \section{Why choose Mercurial?} 250 \section{Why choose Mercurial?}
178 251
179 252
180 %%% Local Variables: 253 %%% Local Variables: