comparison en/concepts.tex @ 112:2fcead053b7a

More. Concept. Fun.
author Bryan O'Sullivan <bos@serpentine.com>
date Mon, 13 Nov 2006 13:21:29 -0800
parents 34b8b7a15ea1
children a0f57b3e677e
111:34b8b7a15ea1 112:2fcead053b7a
9 This understanding gives me confidence that Mercurial has been 9 This understanding gives me confidence that Mercurial has been
10 carefully designed to be both \emph{safe} and \emph{efficient}. And 10 carefully designed to be both \emph{safe} and \emph{efficient}. And
11 just as importantly, if it's easy for me to retain a good idea of what 11 just as importantly, if it's easy for me to retain a good idea of what
12 the software is doing when I perform a revision control task, I'm less 12 the software is doing when I perform a revision control task, I'm less
13 likely to be surprised by its behaviour. 13 likely to be surprised by its behaviour.
14
15 In this chapter, we'll first cover the core concepts behind
16 Mercurial's design, then discuss some of the interesting details of
17 its implementation.
14 18
15 \section{Mercurial's historical record} 19 \section{Mercurial's historical record}
16 20
17 \subsection{Tracking the history of a single file} 21 \subsection{Tracking the history of a single file}
18 22
172 video stream; the next delta is generated against that frame. This 176 video stream; the next delta is generated against that frame. This
173 means that if the video signal gets interrupted, it will resume once 177 means that if the video signal gets interrupted, it will resume once
174 the next key frame is received. Also, the accumulation of encoding 178 the next key frame is received. Also, the accumulation of encoding
175 errors restarts anew with each key frame. 179 errors restarts anew with each key frame.
176 180
177 \subsection{Strong integrity} 181 \subsection{Identification and strong integrity}
178 182
179 Along with delta or snapshot information, a revlog entry contains a 183 Along with delta or snapshot information, a revlog entry contains a
180 cryptographic hash of the data that it represents. This makes it 184 cryptographic hash of the data that it represents. This makes it
181 difficult to forge the contents of a revision, and easy to detect 185 difficult to forge the contents of a revision, and easy to detect
182 accidental corruption. The hash that Mercurial uses is SHA-1, which 186 accidental corruption.
183 is 160 bits long. Although all revision data is hashed, the changeset 187
188 Hashes provide more than a mere check against corruption; they are
189 used as the identifiers for revisions. The changeset identification
184 hashes that you see as an end user are from revisions of the 190 hashes that you see as an end user are from revisions of the
185 changelog. Manifest and file hashes are only used behind the scenes. 191 changelog. Although filelogs and the manifest also use hashes,
186 192 Mercurial only uses these behind the scenes.
187 Mercurial checks these hashes when retrieving file revisions and when 193
188 pulling changes from a repository. If it encounters an integrity 194 Mercurial verifies that hashes are correct when it retrieves file
189 problem, it will complain and stop whatever it's doing. 195 revisions and when it pulls changes from another repository. If it
196 encounters an integrity problem, it will complain and stop whatever
197 it's doing.
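
To make the idea of a hash serving as both identifier and integrity check concrete, here is a minimal sketch in Python. It assumes SHA-1 as the hash function and hashes only the raw revision data; the names \texttt{revision\_id} and \texttt{check\_revision} are hypothetical, and Mercurial's real computation may cover more than the data alone.

```python
import hashlib


def revision_id(data: bytes) -> str:
    """Return a hex identifier derived from the revision data (SHA-1 assumed)."""
    return hashlib.sha1(data).hexdigest()


def check_revision(data: bytes, expected_id: str) -> None:
    """Recompute the hash and refuse to continue if it does not match."""
    actual_id = revision_id(data)
    if actual_id != expected_id:
        # Mirrors "complain and stop": raise instead of returning bad data.
        raise ValueError(
            f"integrity check failed: expected {expected_id}, got {actual_id}"
        )


if __name__ == "__main__":
    stored = b"hello, revlog\n"
    ident = revision_id(stored)
    check_revision(stored, ident)  # passes silently
    try:
        check_revision(stored + b"corrupted", ident)
    except ValueError as err:
        print(err)
```

Because the identifier is derived from the content, any change to the stored data, accidental or deliberate, produces a different hash, and the mismatch is noticed as soon as the revision is read.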
190 198
191 In addition to the effect it has on retrieval efficiency, Mercurial's 199 In addition to the effect it has on retrieval efficiency, Mercurial's
192 use of periodic snapshots makes it more robust against partial data 200 use of periodic snapshots makes it more robust against partial data
193 corruption. If a revlog becomes partly corrupted due to a hardware 201 corruption. If a revlog becomes partly corrupted due to a hardware
194 error or system bug, it's often possible to reconstruct some or most 202 error or system bug, it's often possible to reconstruct some or most
217 time has changed, but the size has not, only then does Mercurial need 225 time has changed, but the size has not, only then does Mercurial need
218 to read the actual contents of the file to see if they've changed. 226 to read the actual contents of the file to see if they've changed.
219 Storing these few extra pieces of information dramatically reduces the 227 Storing these few extra pieces of information dramatically reduces the
220 amount of data that Mercurial needs to read, which yields large 228 amount of data that Mercurial needs to read, which yields large
221 performance improvements compared to other revision control systems. 229 performance improvements compared to other revision control systems.
230
231 \section{Revision history, branching,
232 and merging}
233
234 Every entry in a Mercurial revlog knows the identity of its immediate
235 ancestor revision, usually referred to as its \emph{parent}. In fact,
236 a revision contains room for not one parent, but two. Mercurial uses
237 a special hash, called the ``null ID'', to represent the idea ``there
238 is no parent here''. This hash is simply a string of zeroes.
239
240 In figure~\ref{fig:concepts:revlog}, you can see an example of the
241 conceptual structure of a revlog. Filelogs, manifests, and changelogs
242 all have this same structure; they differ only in the kind of data
243 stored in each delta or snapshot.
244
245 The first revision in a revlog (at the bottom of the image) has the
246 null ID in both of its parent slots. For a ``normal'' revision, its
247 first parent slot contains the ID of its parent revision, and its
248 second contains the null ID, indicating that the revision has only one
249 real parent. Any two revisions that have the same parent ID are
250 branches of that parent. A revision that represents a merge between
251 branches has a non-null revision ID in both of its parent slots.
252
253 \begin{figure}[ht]
254 \centering
255 \grafix{revlog}
256 \caption{The conceptual structure of a revlog}
257 \label{fig:concepts:revlog}
258 \end{figure}
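
The parent-slot rules described above can be sketched in a few lines of Python. This is an illustrative model only, not Mercurial's own data structure: the toy revlog is a plain dictionary, the short IDs stand in for real hashes, and the names \texttt{NULL\_ID}, \texttt{classify}, and \texttt{branch\_points} are hypothetical. The 40-character null ID assumes hex-encoded SHA-1 hashes.

```python
NULL_ID = "0" * 40  # "there is no parent here": a string of zeroes


def classify(p1: str, p2: str) -> str:
    """Classify a revision by how many of its parent slots are filled."""
    real_parents = [p for p in (p1, p2) if p != NULL_ID]
    return ("root", "normal", "merge")[len(real_parents)]


def branch_points(revlog: dict) -> dict:
    """Map each parent ID to its children, keeping parents with 2+ children."""
    children = {}
    for rev, parents in revlog.items():
        # A set avoids counting a revision twice if both slots were equal.
        for parent in sorted(set(parents) - {NULL_ID}):
            children.setdefault(parent, []).append(rev)
    return {p: kids for p, kids in children.items() if len(kids) > 1}


if __name__ == "__main__":
    # Toy history: a is the first revision, b and c both descend from a,
    # and d merges b and c.
    toy = {
        "a": (NULL_ID, NULL_ID),
        "b": ("a", NULL_ID),
        "c": ("a", NULL_ID),
        "d": ("b", "c"),
    }
    for rev, (p1, p2) in toy.items():
        print(rev, classify(p1, p2))
    print(branch_points(toy))
```

In the toy history, \texttt{b} and \texttt{c} share the parent \texttt{a}, so they are branches, and \texttt{d} is a merge because neither of its parent slots holds the null ID.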
222 259
223 \section{Other interesting design features} 260 \section{Other interesting design features}
224 261
225 In the sections above, I've tried to highlight some of the most 262 In the sections above, I've tried to highlight some of the most
226 important aspects of Mercurial's design, to illustrate that it pays 263 important aspects of Mercurial's design, to illustrate that it pays