Mercurial > hgbook
changeset 781:93d19b27859c
started concepts.tex
author | Yoshiki Yazawa <yaz@honeyplanet.jp> |
---|---|
date | Tue, 12 May 2009 17:49:11 +0900 |
parents | 2a93b1d35990 |
children | 43556decb81f |
files | ja/concepts.tex ja/todo.txt |
diffstat | 2 files changed, 534 insertions(+), 372 deletions(-) [+] |
line wrap: on
line diff
--- a/ja/concepts.tex Sun May 03 17:10:09 2009 +0900 +++ b/ja/concepts.tex Tue May 12 17:49:11 2009 +0900 @@ -1,232 +1,364 @@ -\chapter{Behind the scenes} +%\chapter{Behind the scenes} +\chapter{$BIqBfN"(B} \label{chap:concepts} -Unlike many revision control systems, the concepts upon which -Mercurial is built are simple enough that it's easy to understand how -the software really works. Knowing this certainly isn't necessary, -but I find it useful to have a ``mental model'' of what's going on. +%Unlike many revision control systems, the concepts upon which +%Mercurial is built are simple enough that it's easy to understand how +%the software really works. Knowing this certainly isn't necessary, +%but I find it useful to have a ``mental model'' of what's going on. + +$BB?$/$N%j%S%8%g%s%3%s%H%m!<%k%7%9%F%`$H0c$C$F!$(B Mercurial$B$NF0:n$N4pK\$H$J$C(B +$B$F$$$k%3%s%;%W%H$rM}2r$9$k$3$H$OMF0W$$!%(B +$B$3$l$rM}2r$9$k$3$H$OI,$:$7$bI,MW$G$O$J$$$,!$I.<T$O!$2?$,5/$-$F$$$k$N$+$K(B +$B$D$$$F%b%G%k$r0U<1$7$F$$$k$3$H$OM-MQ$G$"$k$H9M$($F$$$k!%(B + +%This understanding gives me confidence that Mercurial has been +%carefully designed to be both \emph{safe} and \emph{efficient}. And +%just as importantly, if it's easy for me to retain a good idea of what +%the software is doing when I perform a revision control task, I'm less +%likely to be surprised by its behaviour. -This understanding gives me confidence that Mercurial has been -carefully designed to be both \emph{safe} and \emph{efficient}. And -just as importantly, if it's easy for me to retain a good idea of what -the software is doing when I perform a revision control task, I'm less -likely to be surprised by its behaviour. +$B$3$N$3$H$rM}2r$7$?8e$G!$I.<T$O(BMercurial$B$,(B\emph{$B0BA4(B}$B$H(B\emph{$B8zN((B}$B$r<B8=$9(B +$B$k$h$&$KCm0U?<$/@_7W$5$l$F$$$k$H3N?.$9$k$K;j$C$?!%(B +$B$^$?!$=EMW$JE@$H$7$F!$%j%S%8%g%s%3%s%H%m!<%k$NA`:n$r9T$&:]$K%=%U%H%&%'%"(B +$B$,2?$r$9$k$N$+$r5-21$KN1$a$F$*$/$3$H$K$h$C$F!$IT0U$N5sF0$G6C$/$3$H$,>/$J(B +$B$/$J$C$?!%(B -In this chapter, we'll initially cover the core concepts behind -Mercurial's design, then continue to discuss some of the interesting -details of its implementation. +%In this chapter, we'll initially cover the core concepts behind +%Mercurial's design, then continue to discuss some of the interesting +%details of its implementation. -\section{Mercurial's historical record} +$B$3$N>O$G$O$^$:(BMercurial$B$N@_7W$N%3%"%3%s%;%W%H$r%+%P!<$9$k!%$=$7$F<BAu>e$N(B +$B$$$/$D$+$N6=L#?<$$E@$N>\:Y$K$D$$$F5DO@$9$k!%(B + +%\section{Mercurial's historical record} +\section{Mercurial$B$NMzNr5-O?(B} -\subsection{Tracking the history of a single file} +%\subsection{Tracking the history of a single file} +\subsection{$B%U%!%$%kMzNr$NDI@W(B} + +%When Mercurial tracks modifications to a file, it stores the history +%of that file in a metadata object called a \emph{filelog}. Each entry +%in the filelog contains enough information to reconstruct one revision +%of the file that is being tracked. Filelogs are stored as files in +%the \sdirname{.hg/store/data} directory. A filelog contains two kinds +%of information: revision data, and an index to help Mercurial to find +%a revision efficiently. -When Mercurial tracks modifications to a file, it stores the history -of that file in a metadata object called a \emph{filelog}. Each entry -in the filelog contains enough information to reconstruct one revision -of the file that is being tracked. Filelogs are stored as files in -the \sdirname{.hg/store/data} directory. A filelog contains two kinds -of information: revision data, and an index to help Mercurial to find -a revision efficiently. +Mercurial$B$O%U%!%$%k$X$NJQ99$rDI@W$9$k;~!$%U%!%$%k$NMzNr$r(B\emph{filelog}$B$H(B +$B8F$P$l$k%a%?%G!<%?%*%V%8%'%/%H$K3JG<$9$k!%%U%!%$%k%m%0Fb$N3F!9$N%(%s%H%j(B +$B$O!$DI@WBP>]$N%U%!%$%k$N%j%S%8%g%s$r:F7z$9$k$N$K==J,$J>pJs$r;}$D!%%U%!%$(B +$B%k%m%0$O(B\sdirname{.hg/store/data}$B%G%#%l%/%H%j$K%U%!%$%k$H$7$FJ]B8$5$l$k!%(B +$B%U%!%$%k%m%0$O%j%S%8%g%s%G!<%?$H(BMercurial$B$,%j%S%8%g%s$r8zN(E*$K8+$D$1$i$l(B +$B$k$h$&$K$9$k$?$a$N%$%s%G%C%/%9$N(B2$B<oN`$N>pJs$r;}$D!%(B -A file that is large, or has a lot of history, has its filelog stored -in separate data (``\texttt{.d}'' suffix) and index (``\texttt{.i}'' -suffix) files. For small files without much history, the revision -data and index are combined in a single ``\texttt{.i}'' file. The -correspondence between a file in the working directory and the filelog -that tracks its history in the repository is illustrated in -figure~\ref{fig:concepts:filelog}. +%A file that is large, or has a lot of history, has its filelog stored +%in separate data (``\texttt{.d}'' suffix) and index (``\texttt{.i}'' +%suffix) files. For small files without much history, the revision +%data and index are combined in a single ``\texttt{.i}'' file. The +%correspondence between a file in the working directory and the filelog +%that tracks its history in the repository is illustrated in +%figure~\ref{fig:concepts:filelog}. + +$B%5%$%:$NBg$-$J%U%!%$%k$d!$KDBg$JMzNr$r;}$D%U%!%$%k$O!$%G!<%?$,(B +(``\texttt{.d}'' suffix) $B$*$h$S%$%s%G%C%/%9(B (``\texttt{.i}'' suffix)$B$N%U%!(B +$B%$%k$KJ,3d$5$l$?(Bfilelog$B$r;}$D!%%5%$%:$,>.$5$/!$MzNr$NBg$-$/$J$$%U%!%$%k$O(B +$B%j%S%8%g%s%G!<%?$H%$%s%G%C%/%9$,(B1$B$D$N(B``\texttt{.i}''$B%U%!%$%k$K7k9g$5$l$F(B +$B$$$k!%%o!<%-%s%0%G%#%l%/%H%jFb$N%U%!%$%k$H%j%]%8%H%jFb$NMzNr$rDI@W$9$k(B +filelog$B$H$NBP1~$r?^(B~\ref{fig:concepts:filelog}$B$K<($9!%(B \begin{figure}[ht] \centering % \grafix{filelog} \includegraphics{filelog} - \caption{Relationships between files in working directory and - filelogs in repository} +% \caption{Relationships between files in working directory and +% filelogs in repository} + \caption{$B%o!<%-%s%0%G%#%l%/%H%jFb$N%U%!%$%k$H%j%]%8%H%j$N%U%!%$%k%m%0(B + $B$N4X78(B} \label{fig:concepts:filelog} \end{figure} -\subsection{Managing tracked files} +%\subsection{Managing tracked files} +\subsection{$BDI@W$5$l$F$$$k%U%!%$%k$N4IM}(B} + +%Mercurial uses a structure called a \emph{manifest} to collect +%together information about the files that it tracks. Each entry in +%the manifest contains information about the files present in a single +%changeset. An entry records which files are present in the changeset, +%the revision of each file, and a few other pieces of file metadata. -Mercurial uses a structure called a \emph{manifest} to collect -together information about the files that it tracks. Each entry in -the manifest contains information about the files present in a single -changeset. An entry records which files are present in the changeset, -the revision of each file, and a few other pieces of file metadata. +Mercurial$B$O(B\emph{$B%^%K%U%'%9%H(B}$B$H8F$P$l$k9=B$$rMQ$$$F!$DI@W$9$Y$-%U%!%$%k(B +$B$N>pJs$r=8$a$F$$$k!%%^%K%U%'%9%HFb$N3F!9$N%(%s%H%j$O!$C10l$N%A%'%s%8%;%C(B +$B%HFb$KB8:_$9$k%U%!%$%k$N>pJs$r;}$C$F$$$k!%%(%s%H%j$O$I$N%U%!%$%k$,%A%'%s(B +$B%8%;%C%H$KB8:_$7$F$$$k$+!$$=$l$i$N%j%S%8%g%s$,2?$G$"$k$N$+$H$$$&>pJs$H!$(B +$B$$$/$D$+$NB>$N%U%!%$%k%a%?%G!<%?$r5-O?$7$F$$$k!%(B + +%\subsection{Recording changeset information} +\subsection{$B%A%'%s%8%;%C%H>pJs$N5-O?(B} -\subsection{Recording changeset information} +%The \emph{changelog} contains information about each changeset. Each +%revision records who committed a change, the changeset comment, other +%pieces of changeset-related information, and the revision of the +%manifest to use. -The \emph{changelog} contains information about each changeset. Each -revision records who committed a change, the changeset comment, other -pieces of changeset-related information, and the revision of the -manifest to use. +\emph{changelog}$B$O3F!9$N%A%'%s%8%;%C%H$N>pJs$r;}$D!%3F!9$N%j%S%8%g%s5-O?(B +$B$O!$C/$,JQ99$r%3%_%C%H$7$?$N$+!$%A%'%s%8%;%C%H$N%3%a%s%H!$JQ99$K4XO"$7$?(B +$BB>$N>pJs!$;HMQ$5$l$k%^%K%U%'%9%H$N%j%S%8%g%s$r;}$D!%(B -\subsection{Relationships between revisions} +%\subsection{Relationships between revisions} +\subsection{$B%j%S%8%g%s4V$N4X78(B} + +%Within a changelog, a manifest, or a filelog, each revision stores a +%pointer to its immediate parent (or to its two parents, if it's a +%merge revision). As I mentioned above, there are also relationships +%between revisions \emph{across} these structures, and they are +%hierarchical in nature. -Within a changelog, a manifest, or a filelog, each revision stores a -pointer to its immediate parent (or to its two parents, if it's a -merge revision). As I mentioned above, there are also relationships -between revisions \emph{across} these structures, and they are -hierarchical in nature. +$B%A%'%s%8%m%0!$%^%K%U%'%9%H!$%U%!%$%k%m%0Fb$G!$3F!9$N%j%S%8%g%s$OD>@\$N?F(B +$B!J$"$k$$$O%^!<%8$N>l9g$O$=$NN>?F!K$X$N%]%$%s%?$r;}$D!%(B +$B$9$G$K=R$Y$?$h$&$K!$$3$N9=B$$K8=$l$k%j%S%8%g%s$N4V$K$O4X78$,$"$j!$K\<A(B +$BE*$K3,AXE*$G$"$k!%(B -For every changeset in a repository, there is exactly one revision -stored in the changelog. Each revision of the changelog contains a -pointer to a single revision of the manifest. A revision of the -manifest stores a pointer to a single revision of each filelog tracked -when that changeset was created. These relationships are illustrated -in figure~\ref{fig:concepts:metadata}. +%For every changeset in a repository, there is exactly one revision +%stored in the changelog. Each revision of the changelog contains a +%pointer to a single revision of the manifest. A revision of the +%manifest stores a pointer to a single revision of each filelog tracked +%when that changeset was created. These relationships are illustrated +%in figure~\ref{fig:concepts:metadata}. + +$B%j%]%8%H%j$K$"$k$9$Y$F$N%A%'%s%8%;%C%H$O!$%A%'%s%8%m%0Fb$G@53N$K(B1$B$D$N%j%S(B +$B%8%g%s$r;}$D!%%A%'%s%8%m%0$N3F!9$N%j%S%8%g%s$O!$%^%K%U%'%9%H$NC10l$N%P!<(B +$B%8%g%s$X$N%]%$%s%?$r;}$D!%%^%K%U%'%9%H$N3F!9$N%j%S%8%g%s$O!$%A%'%s%8%;%C(B +$B%H$,@8@.$5$l$?;~$N%U%!%$%k%m%0$N3F!9$NC10l%j%S%8%g%s$X$N%]%$%s%?$r;}$D!%(B +$B$3$N4X78$r?^(B~\ref{fig:concepts:metadata}$B$K<($9!%(B \begin{figure}[ht] \centering \includegraphics{metadata} - \caption{Metadata relationships} +% \caption{Metadata relationships} + \caption{$B%a%?%G!<%?$N4X78(B} \label{fig:concepts:metadata} \end{figure} -As the illustration shows, there is \emph{not} a ``one to one'' -relationship between revisions in the changelog, manifest, or filelog. -If the manifest hasn't changed between two changesets, the changelog -entries for those changesets will point to the same revision of the -manifest. If a file that Mercurial tracks hasn't changed between two -changesets, the entry for that file in the two revisions of the -manifest will point to the same revision of its filelog. +%As the illustration shows, there is \emph{not} a ``one to one'' +%relationship between revisions in the changelog, manifest, or filelog. +%If the manifest hasn't changed between two changesets, the changelog +%entries for those changesets will point to the same revision of the +%manifest. If a file that Mercurial tracks hasn't changed between two +%changesets, the entry for that file in the two revisions of the +%manifest will point to the same revision of its filelog. -\section{Safe, efficient storage} +$B%$%i%9%H$G<($5$l$F$$$k$h$&$K!$%j%S%8%g%s$H%A%'%s%8%m%0!$%^%K%U%'%9%H$^$?(B +$B$O%U%!%$%k%m%0$N4V$K$O(B1$BBP#1$N4X78$O$J$$!%%^%K%U%'%9%H$,(B2$B$D$N%A%'%s%8%;%C(B +$B%H$N4V$GJQ2=$7$J$+$C$?>l9g$O!$%A%'%s%8%m%0Fb$G$3$l$i$N%A%'%s%8%;%C%H$rI=(B +$B$9%(%s%H%j$OF1$8%P!<%8%g%s$N%^%K%U%'%9%H$r<($9!%(BMercurial$B$,DI@W$9$k%U%!%$(B +$B%k$,!$(B2$B$D$N%A%'%s%8%;%C%H4V$GJQ2=$7$J$+$C$?>l9g$O!$(B 2$B$D$N%^%K%U%'%9%H$N%j(B +$B%S%8%g%s$G!$$=$N%U%!%$%k$r<($9%(%s%H%j$O%U%!%$%k%m%0$NF1$8%j%S%8%g%s$r<((B +$B$9!%(B -The underpinnings of changelogs, manifests, and filelogs are provided -by a single structure called the \emph{revlog}. +%\section{Safe, efficient storage} +\section{$B0BA4$+$D8zN(E*$J%9%H%l!<%8(B} + +%The underpinnings of changelogs, manifests, and filelogs are provided +%by a single structure called the \emph{revlog}. -\subsection{Efficient storage} +$B%A%'%s%8%m%0!$%^%K%U%'%9%H!$%U%!%$%k%m%0$NEZBf$K;H$o$l$F$$$k$O6&DL$N(B +\emph{revlog}$B$H$$$&9=B$BN$G$"$k!%(B + +%\subsection{Efficient storage} +\subsection{$B8zN(E*$J%9%H%l!<%8(B} -The revlog provides efficient storage of revisions using a -\emph{delta} mechanism. Instead of storing a complete copy of a file -for each revision, it stores the changes needed to transform an older -revision into the new revision. For many kinds of file data, these -deltas are typically a fraction of a percent of the size of a full -copy of a file. +%The revlog provides efficient storage of revisions using a +%\emph{delta} mechanism. Instead of storing a complete copy of a file +%for each revision, it stores the changes needed to transform an older +%revision into the new revision. For many kinds of file data, these +%deltas are typically a fraction of a percent of the size of a full +%copy of a file. + +revlog$B$O(B\emph{delta}$B5!9=$r;H$C$F%j%S%8%g%s$N8zN(E*$J5-21$rDs6!$9$k!%%U%!(B +$B%$%k$N3F!9$N%P!<%8%g%s$N40A4$J%3%T!<$rJ]B8$9$k$N$G$O$J$/!$8E$$%j%S%8%g%s(B +$B$r?7$7$$%P!<%8%g%s$XJQ49$9$k$N$KI,MW$JJQ99$rJ]B8$9$k!%B?$/$N%U%!%$%k%G!<(B +$B%?$KBP$7$F!$(B delta$B$OE57?E*$K$O%U%!%$%k$N%U%k%3%T!<$N(B1$B%Q!<%;%s%HL$K~$G$"$"(B +$B$k!%(B -Some obsolete revision control systems can only work with deltas of -text files. They must either store binary files as complete snapshots -or encoded into a text representation, both of which are wasteful -approaches. Mercurial can efficiently handle deltas of files with -arbitrary binary contents; it doesn't need to treat text as special. +%Some obsolete revision control systems can only work with deltas of +%text files. They must either store binary files as complete snapshots +%or encoded into a text representation, both of which are wasteful +%approaches. Mercurial can efficiently handle deltas of files with +%arbitrary binary contents; it doesn't need to treat text as special. -\subsection{Safe operation} +$B8E$$%j%S%8%g%s%3%s%H%m!<%k%7%9%F%`$N$$$/$D$+$O%F%-%9%H%U%!%$%k$N(Bdelta$B$KBP(B +$B$7$F$7$+5!G=$7$J$$!%$=$l$i$N%7%9%F%`$G$O%P%$%J%j%U%!%$%k$O40A4$J%9%J%C%W(B +$B%7%g%C%H$+!$%F%-%9%HI=8=$K%(%s%3!<%I$5$l$?7A<0$G$"$kI,MW$,$"$k!%$3$l$i$O(B +$B6&$KL5BL$NB?$$%"%W%m!<%A$G$"$k!%(B Mercurial$B$OG$0U$N%P%$%J%j%U%!%$%k$K$D$$(B +$B$F!$(Bdelta$B$r8zN(E*$K07$&$3$H$,$G$-!$%F%-%9%H$rFCJL07$$$9$kI,MW$,$J$$!%(B + +%\subsection{Safe operation} +\subsection{$B0BA4$JF0:n(B} \label{sec:concepts:txn} -Mercurial only ever \emph{appends} data to the end of a revlog file. -It never modifies a section of a file after it has written it. This -is both more robust and efficient than schemes that need to modify or -rewrite data. +%Mercurial only ever \emph{appends} data to the end of a revlog file. +%It never modifies a section of a file after it has written it. This +%is both more robust and efficient than schemes that need to modify or +%rewrite data. + +Mercurial$B$O(Brevlog$B%U%!%$%k$NKvHx$K%G!<%?$N(B\emph{$BDI2C(B}$B$N$_$r9T$&!%0lEY=q$-(B +$B9~$^$l$?ItJ,$O8e$K$J$C$FJQ99$5$l$k$3$H$O$J$$!%$3$l$O%G!<%?$NJQ99$d:F=q$-(B +$B9~$_$r9T$&J}K!$h$j$b4h6/$+$D8zN(E*$G$"$k!%(B -In addition, Mercurial treats every write as part of a -\emph{transaction} that can span a number of files. A transaction is -\emph{atomic}: either the entire transaction succeeds and its effects -are all visible to readers in one go, or the whole thing is undone. -This guarantee of atomicity means that if you're running two copies of -Mercurial, where one is reading data and one is writing it, the reader -will never see a partially written result that might confuse it. +%In addition, Mercurial treats every write as part of a +%\emph{transaction} that can span a number of files. A transaction is +%\emph{atomic}: either the entire transaction succeeds and its effects +%are all visible to readers in one go, or the whole thing is undone. +%This guarantee of atomicity means that if you're running two copies of +%Mercurial, where one is reading data and one is writing it, the reader +%will never see a partially written result that might confuse it. + +$B2C$($F!$(BMercurial$B$O$"$i$f$k=q$-9~$_$rJ#?t$N%U%!%$%k$X$N(B\emph{$B%H%i%s%6%/%7%g(B +$B%s(B}$B$N0lIt$H8+$J$9!%%H%i%s%6%/%7%g%sA4BN$,@.8y$7!$FI$_=P$7B&$K0lEY$K7k2L$,(B +$B8+$($k>l9g$b!$$=$&$G$J$$>l9g$b%H%i%s%6%/%7%g%s$O(B\emph{$B%"%H%_%C%/(B}$B$G$"$k!%(B +$B$3$N%"%H%_%C%/@-$NJ]>Z$O!$(B 2$B$D$N(BMercurial$B$r!$JRJ}$O%G!<%?$NFI$_=P$7!$$b$&(B +$B0lJ}$O=q$-9~$_$G<B9T$7$F$$$k>l9g!$FI$_=P$7B&$K$O:.Mp$N860x$H$J$kItJ,E*$J(B +$B=q$-9~$_7k2L$O8+$($J$$$3$H$r0UL#$9$k!%(B -The fact that Mercurial only appends to files makes it easier to -provide this transactional guarantee. The easier it is to do stuff -like this, the more confident you should be that it's done correctly. +%The fact that Mercurial only appends to files makes it easier to +%provide this transactional guarantee. The easier it is to do stuff +%like this, the more confident you should be that it's done correctly. -\subsection{Fast retrieval} +Mercurial$B$O%U%!%$%k$KDI5-$N$_$r$9$k$3$H$G!$%H%i%s%6%/%7%g%s$NJ]>Z$rMF0W$K(B +$B$7$F$$$k!%J*;v$rC1=c2=$9$k$3$H$K$h$C$F!$=hM}$N@5$7$5$r3N<B$K$9$k$h$&$K$7(B +$B$F$$$k!%(B + +%\subsection{Fast retrieval} +\subsection{$B9bB.$J<hF@(B} -Mercurial cleverly avoids a pitfall common to all earlier -revision control systems: the problem of \emph{inefficient retrieval}. -Most revision control systems store the contents of a revision as an -incremental series of modifications against a ``snapshot''. To -reconstruct a specific revision, you must first read the snapshot, and -then every one of the revisions between the snapshot and your target -revision. The more history that a file accumulates, the more -revisions you must read, hence the longer it takes to reconstruct a -particular revision. +%Mercurial cleverly avoids a pitfall common to all earlier +%revision control systems: the problem of \emph{inefficient retrieval}. +%Most revision control systems store the contents of a revision as an +%incremental series of modifications against a ``snapshot''. To +%reconstruct a specific revision, you must first read the snapshot, and +%then every one of the revisions between the snapshot and your target +%revision. The more history that a file accumulates, the more +%revisions you must read, hence the longer it takes to reconstruct a +%particular revision. + +Mercurial$B$O$3$l$^$G$N%j%S%8%g%s%3%s%H%m!<%k%7%9%F%`$K6&DL$7$?Mn$H$77j$r(B +$B8-L@$KHr$1$F$$$k!%LdBj$@$C$?$N$O(B\emph{$BHs8zN(E*$J<hF@(B}$B$G$"$C$?!%(B +$BBgDq$N%j%S%8%g%s%3%s%H%m!<%k%7%9%F%`$O%j%S%8%g%s$NFbMF$r%9%J%C%W%7%g%C%H(B +$B$X$N0lO"$NJQ99$NA}J,$H$7$FJ]B8$7$F$$$k!%FCDj$N%j%S%8%g%s$r:F8=$9$k$?$a$K(B +$B$O$^$:%9%J%C%W%7%g%C%H$rFI$_9~$_!$$7$+$k8e$KL\E*$N%j%S%8%g%s$rFI$`I,MW$,(B +$B$"$C$?!%%U%!%$%k$X$NMzNr$,A}$($l$PA}$($k$[$IFI$_9~$^$J$1$l$P$J$i$J$$%j%S(B +$B%8%g%s$,B?$/$J$j!$FCDj$N%j%S%8%g%s$r:F8=$9$k$N$K;~4V$,$+$+$k$h$&$K$J$k!%(B \begin{figure}[ht] \centering \includegraphics{snapshot} - \caption{Snapshot of a revlog, with incremental deltas} +% \caption{Snapshot of a revlog, with incremental deltas} + \caption{$B:9J,$rMQ$$$?(Brevlog$B$N%9%J%C%W%7%g%C%H(B} \label{fig:concepts:snapshot} \end{figure} -The innovation that Mercurial applies to this problem is simple but -effective. Once the cumulative amount of delta information stored -since the last snapshot exceeds a fixed threshold, it stores a new -snapshot (compressed, of course), instead of another delta. This -makes it possible to reconstruct \emph{any} revision of a file -quickly. This approach works so well that it has since been copied by -several other revision control systems. +%The innovation that Mercurial applies to this problem is simple but +%effective. Once the cumulative amount of delta information stored +%since the last snapshot exceeds a fixed threshold, it stores a new +%snapshot (compressed, of course), instead of another delta. This +%makes it possible to reconstruct \emph{any} revision of a file +%quickly. This approach works so well that it has since been copied by +%several other revision control systems. -Figure~\ref{fig:concepts:snapshot} illustrates the idea. In an entry -in a revlog's index file, Mercurial stores the range of entries from -the data file that it must read to reconstruct a particular revision. +Mercurial$B$G$O$3$NLdBj$rC1=c$@$,8z2LE*$JJ}K!$G2r7h$7$F$$$k!%A02s$K%9%J%C%W(B +$B%7%g%C%H$r:n@.$7$?;~E@$+$i$NN_@QE*$JA}J,$,0lDj$NogCM$r1[$($k$H!$?7$?$JA}(B +$BJ,$G$O$J$/!$?7$?$J%9%J%C%W%7%g%C%H$,J]B8$5$l$k!J$3$N%9%J%C%W%7%g%C%H$OL^(B +$BO@05=L$5$l$F$$$k!K!%$3$l$K$h$C$F(B\emph{$B$$$+$J$k(B}$B%j%S%8%g%s$b?WB.$K:F8=$5$l(B +$B$k!%$3$N%"%W%m!<%A$O0J8eB>$N%j%S%8%g%s%3%s%H%m!<%k%7%9%F%`$K%3%T!<$5$l$k(B +$B$[$I$&$^$/5!G=$7$F$$$k!%(B -\subsubsection{Aside: the influence of video compression} +%Figure~\ref{fig:concepts:snapshot} illustrates the idea. In an entry +%in a revlog's index file, Mercurial stores the range of entries from +%the data file that it must read to reconstruct a particular revision. -If you're familiar with video compression or have ever watched a TV -feed through a digital cable or satellite service, you may know that -most video compression schemes store each frame of video as a delta -against its predecessor frame. In addition, these schemes use -``lossy'' compression techniques to increase the compression ratio, so -visual errors accumulate over the course of a number of inter-frame -deltas. +$B?^(B~\ref{fig:concepts:snapshot}$B$K35G0$r<($9!%(BMercurial$B$O(Brevlog$B$N%$%s%G%C%/(B +$B%9%U%!%$%k$N%(%s%H%j$KFCDj$N%j%S%8%g%s$r:F8=$9$k$N$KI,MW$J%G!<%?%U%!%$%k(B +$BFb$N$"$kHO0O$N%(%s%H%j$rJ]B8$9$k!%(B + + + + + + + + -Because it's possible for a video stream to ``drop out'' occasionally -due to signal glitches, and to limit the accumulation of artefacts -introduced by the lossy compression process, video encoders -periodically insert a complete frame (called a ``key frame'') into the -video stream; the next delta is generated against that frame. This -means that if the video signal gets interrupted, it will resume once -the next key frame is received. Also, the accumulation of encoding -errors restarts anew with each key frame. + + +%\subsubsection{Aside: the influence of video compression} +\subsubsection{$B$=$NB>(B: $B%S%G%*05=L$N1F6A(B} -\subsection{Identification and strong integrity} - -Along with delta or snapshot information, a revlog entry contains a -cryptographic hash of the data that it represents. This makes it -difficult to forge the contents of a revision, and easy to detect -accidental corruption. +%If you're familiar with video compression or have ever watched a TV +%feed through a digital cable or satellite service, you may know that +%most video compression schemes store each frame of video as a delta +%against its predecessor frame. In addition, these schemes use +%``lossy'' compression techniques to increase the compression ratio, so +%visual errors accumulate over the course of a number of inter-frame +%deltas. -Hashes provide more than a mere check against corruption; they are -used as the identifiers for revisions. The changeset identification -hashes that you see as an end user are from revisions of the -changelog. Although filelogs and the manifest also use hashes, -Mercurial only uses these behind the scenes. +%Because it's possible for a video stream to ``drop out'' occasionally +%due to signal glitches, and to limit the accumulation of artefacts +%introduced by the lossy compression process, video encoders +%periodically insert a complete frame (called a ``key frame'') into the +%video stream; the next delta is generated against that frame. This +%means that if the video signal gets interrupted, it will resume once +%the next key frame is received. Also, the accumulation of encoding +%errors restarts anew with each key frame. + +%\subsection{Identification and strong integrity} +\subsection{$B<1JL$H6/$$0l4S@-(B} -Mercurial verifies that hashes are correct when it retrieves file -revisions and when it pulls changes from another repository. If it -encounters an integrity problem, it will complain and stop whatever -it's doing. +%Along with delta or snapshot information, a revlog entry contains a +%cryptographic hash of the data that it represents. This makes it +%difficult to forge the contents of a revision, and easy to detect +%accidental corruption. -In addition to the effect it has on retrieval efficiency, Mercurial's -use of periodic snapshots makes it more robust against partial data -corruption. If a revlog becomes partly corrupted due to a hardware -error or system bug, it's often possible to reconstruct some or most -revisions from the uncorrupted sections of the revlog, both before and -after the corrupted section. This would not be possible with a -delta-only storage model. +%Hashes provide more than a mere check against corruption; they are +%used as the identifiers for revisions. The changeset identification +%hashes that you see as an end user are from revisions of the +%changelog. Although filelogs and the manifest also use hashes, +%Mercurial only uses these behind the scenes. + +%Mercurial verifies that hashes are correct when it retrieves file +%revisions and when it pulls changes from another repository. If it +%encounters an integrity problem, it will complain and stop whatever +%it's doing. -\section{Revision history, branching, - and merging} +%In addition to the effect it has on retrieval efficiency, Mercurial's +%use of periodic snapshots makes it more robust against partial data +%corruption. If a revlog becomes partly corrupted due to a hardware +%error or system bug, it's often possible to reconstruct some or most +%revisions from the uncorrupted sections of the revlog, both before and +%after the corrupted section. This would not be possible with a +%delta-only storage model. -Every entry in a Mercurial revlog knows the identity of its immediate -ancestor revision, usually referred to as its \emph{parent}. In fact, -a revision contains room for not one parent, but two. Mercurial uses -a special hash, called the ``null ID'', to represent the idea ``there -is no parent here''. This hash is simply a string of zeroes. +%\section{Revision history, branching, and merging} +\section{$B%j%S%8%g%sMzNr!$%V%i%s%A!$%^!<%8(B} -In figure~\ref{fig:concepts:revlog}, you can see an example of the -conceptual structure of a revlog. Filelogs, manifests, and changelogs -all have this same structure; they differ only in the kind of data -stored in each delta or snapshot. +%Every entry in a Mercurial revlog knows the identity of its immediate +%ancestor revision, usually referred to as its \emph{parent}. In fact, +%a revision contains room for not one parent, but two. Mercurial uses +%a special hash, called the ``null ID'', to represent the idea ``there +%is no parent here''. This hash is simply a string of zeroes. -The first revision in a revlog (at the bottom of the image) has the -null ID in both of its parent slots. For a ``normal'' revision, its -first parent slot contains the ID of its parent revision, and its -second contains the null ID, indicating that the revision has only one -real parent. Any two revisions that have the same parent ID are -branches. A revision that represents a merge between branches has two -normal revision IDs in its parent slots. +%In figure~\ref{fig:concepts:revlog}, you can see an example of the +%conceptual structure of a revlog. Filelogs, manifests, and changelogs +%all have this same structure; they differ only in the kind of data +%stored in each delta or snapshot. + +%The first revision in a revlog (at the bottom of the image) has the +%null ID in both of its parent slots. For a ``normal'' revision, its +%first parent slot contains the ID of its parent revision, and its +%second contains the null ID, indicating that the revision has only one +%real parent. Any two revisions that have the same parent ID are +%branches. A revision that represents a merge between branches has two +%normal revision IDs in its parent slots. \begin{figure}[ht] \centering @@ -235,137 +367,146 @@ \label{fig:concepts:revlog} \end{figure} -\section{The working directory} +%\section{The working directory} +\section{$B%o!<%-%s%0%G%#%l%/%H%j(B} -In the working directory, Mercurial stores a snapshot of the files -from the repository as of a particular changeset. +%In the working directory, Mercurial stores a snapshot of the files +%from the repository as of a particular changeset. -The working directory ``knows'' which changeset it contains. When you -update the working directory to contain a particular changeset, -Mercurial looks up the appropriate revision of the manifest to find -out which files it was tracking at the time that changeset was -committed, and which revision of each file was then current. It then -recreates a copy of each of those files, with the same contents it had -when the changeset was committed. +%The working directory ``knows'' which changeset it contains. When you +%update the working directory to contain a particular changeset, +%Mercurial looks up the appropriate revision of the manifest to find +%out which files it was tracking at the time that changeset was +%committed, and which revision of each file was then current. It then +%recreates a copy of each of those files, with the same contents it had +%when the changeset was committed. -The \emph{dirstate} contains Mercurial's knowledge of the working -directory. This details which changeset the working directory is -updated to, and all of the files that Mercurial is tracking in the -working directory. +%The \emph{dirstate} contains Mercurial's knowledge of the working +%directory. This details which changeset the working directory is +%updated to, and all of the files that Mercurial is tracking in the +%working directory. -Just as a revision of a revlog has room for two parents, so that it -can represent either a normal revision (with one parent) or a merge of -two earlier revisions, the dirstate has slots for two parents. When -you use the \hgcmd{update} command, the changeset that you update to -is stored in the ``first parent'' slot, and the null ID in the second. -When you \hgcmd{merge} with another changeset, the first parent -remains unchanged, and the second parent is filled in with the -changeset you're merging with. The \hgcmd{parents} command tells you -what the parents of the dirstate are. +%Just as a revision of a revlog has room for two parents, so that it +%can represent either a normal revision (with one parent) or a merge of +%two earlier revisions, the dirstate has slots for two parents. When +%you use the \hgcmd{update} command, the changeset that you update to +%is stored in the ``first parent'' slot, and the null ID in the second. +%When you \hgcmd{merge} with another changeset, the first parent +%remains unchanged, and the second parent is filled in with the +%changeset you're merging with. The \hgcmd{parents} command tells you +%what the parents of the dirstate are. -\subsection{What happens when you commit} +%\subsection{What happens when you commit} +\subsection{$B%3%_%C%H;~$K2?$,5/$-$k$N$+(B} -The dirstate stores parent information for more than just book-keeping -purposes. Mercurial uses the parents of the dirstate as \emph{the - parents of a new changeset} when you perform a commit. +%The dirstate stores parent information for more than just book-keeping +%purposes. Mercurial uses the parents of the dirstate as \emph{the +% parents of a new changeset} when you perform a commit. \begin{figure}[ht] \centering \includegraphics{wdir} - \caption{The working directory can have two parents} +% \caption{The working directory can have two parents} + \caption{$B%o!<%-%s%0%G%#%l%/%H%j$O(B2$B$D$N?F$r;}$AF@$k(B} \label{fig:concepts:wdir} \end{figure} -Figure~\ref{fig:concepts:wdir} shows the normal state of the working -directory, where it has a single changeset as parent. That changeset -is the \emph{tip}, the newest changeset in the repository that has no -children. +%Figure~\ref{fig:concepts:wdir} shows the normal state of the working +%directory, where it has a single changeset as parent. That changeset +%is the \emph{tip}, the newest changeset in the repository that has no +%children. \begin{figure}[ht] \centering \includegraphics{wdir-after-commit} - \caption{The working directory gains new parents after a commit} +% \caption{The working directory gains new parents after a commit} + \caption{$B%3%_%C%H8e!$%o!<%-%s%0%G%#%l%/%H%j$O?7$?$JN>?F$r;}$D(B} \label{fig:concepts:wdir-after-commit} \end{figure} -It's useful to think of the working directory as ``the changeset I'm -about to commit''. Any files that you tell Mercurial that you've -added, removed, renamed, or copied will be reflected in that -changeset, as will modifications to any files that Mercurial is -already tracking; the new changeset will have the parents of the -working directory as its parents. +%It's useful to think of the working directory as ``the changeset I'm +%about to commit''. Any files that you tell Mercurial that you've +%added, removed, renamed, or copied will be reflected in that +%changeset, as will modifications to any files that Mercurial is +%already tracking; the new changeset will have the parents of the +%working directory as its parents. -After a commit, Mercurial will update the parents of the working -directory, so that the first parent is the ID of the new changeset, -and the second is the null ID. This is shown in -figure~\ref{fig:concepts:wdir-after-commit}. Mercurial doesn't touch -any of the files in the working directory when you commit; it just -modifies the dirstate to note its new parents. +%After a commit, Mercurial will update the parents of the working +%directory, so that the first parent is the ID of the new changeset, +%and the second is the null ID. This is shown in +%figure~\ref{fig:concepts:wdir-after-commit}. Mercurial doesn't touch +%any of the files in the working directory when you commit; it just +%modifies the dirstate to note its new parents. -\subsection{Creating a new head} +%\subsection{Creating a new head} +\subsection{$B?7$?$J%X%C%I$r:n$k(B} -It's perfectly normal to update the working directory to a changeset -other than the current tip. For example, you might want to know what -your project looked like last Tuesday, or you could be looking through -changesets to see which one introduced a bug. In cases like this, the -natural thing to do is update the working directory to the changeset -you're interested in, and then examine the files in the working -directory directly to see their contents as they werea when you -committed that changeset. The effect of this is shown in -figure~\ref{fig:concepts:wdir-pre-branch}. +%It's perfectly normal to update the working directory to a changeset +%other than the current tip. For example, you might want to know what +%your project looked like last Tuesday, or you could be looking through +%changesets to see which one introduced a bug. In cases like this, the +%natural thing to do is update the working directory to the changeset +%you're interested in, and then examine the files in the working +%directory directly to see their contents as they werea when you +%committed that changeset. The effect of this is shown in +%figure~\ref{fig:concepts:wdir-pre-branch}. \begin{figure}[ht] \centering \includegraphics{wdir-pre-branch} - \caption{The working directory, updated to an older changeset} +% \caption{The working directory, updated to an older changeset} + \caption{$B8E$$%A%'%s%8%;%C%H$X$H99?7$5$l$?%o!<%-%s%0%G%#%l%/%H%j(B} \label{fig:concepts:wdir-pre-branch} \end{figure} -Having updated the working directory to an older changeset, what -happens if you make some changes, and then commit? Mercurial behaves -in the same way as I outlined above. The parents of the working -directory become the parents of the new changeset. This new changeset -has no children, so it becomes the new tip. And the repository now -contains two changesets that have no children; we call these -\emph{heads}. You can see the structure that this creates in -figure~\ref{fig:concepts:wdir-branch}. +%Having updated the working directory to an older changeset, what +%happens if you make some changes, and then commit? Mercurial behaves +%in the same way as I outlined above. The parents of the working +%directory become the parents of the new changeset. This new changeset +%has no children, so it becomes the new tip. And the repository now +%contains two changesets that have no children; we call these +%\emph{heads}. You can see the structure that this creates in +%figure~\ref{fig:concepts:wdir-branch}. \begin{figure}[ht] \centering \includegraphics{wdir-branch} - \caption{After a commit made while synced to an older changeset} +% \caption{After a commit made while synced to an older changeset} + \caption{$B8E$$%A%'%s%8%;%C%H$KF14|Cf$K%3%_%C%H$,9T$o$l$?>l9g(B} \label{fig:concepts:wdir-branch} \end{figure} -\begin{note} - If you're new to Mercurial, you should keep in mind a common - ``error'', which is to use the \hgcmd{pull} command without any - options. By default, the \hgcmd{pull} command \emph{does not} - update the working directory, so you'll bring new changesets into - your repository, but the working directory will stay synced at the - same changeset as before the pull. If you make some changes and - commit afterwards, you'll thus create a new head, because your - working directory isn't synced to whatever the current tip is. +%\begin{note} +% If you're new to Mercurial, you should keep in mind a common +% ``error'', which is to use the \hgcmd{pull} command without any +% options. By default, the \hgcmd{pull} command \emph{does not} +% update the working directory, so you'll bring new changesets into +% your repository, but the working directory will stay synced at the +% same changeset as before the pull. If you make some changes and +% commit afterwards, you'll thus create a new head, because your +% working directory isn't synced to whatever the current tip is. +% +% I put the word ``error'' in quotes because all that you need to do +% to rectify this situation is \hgcmd{merge}, then \hgcmd{commit}. In +% other words, this almost never has negative consequences; it just +% surprises people. I'll discuss other ways to avoid this behaviour, +% and why Mercurial behaves in this initially surprising way, later +% on. +%\end{note} - I put the word ``error'' in quotes because all that you need to do - to rectify this situation is \hgcmd{merge}, then \hgcmd{commit}. In - other words, this almost never has negative consequences; it just - surprises people. I'll discuss other ways to avoid this behaviour, - and why Mercurial behaves in this initially surprising way, later - on. -\end{note} +%\subsection{Merging heads} +\subsection{$B%X%C%I4V$N%^!<%8(B} -\subsection{Merging heads} - -When you run the \hgcmd{merge} command, Mercurial leaves the first -parent of the working directory unchanged, and sets the second parent -to the changeset you're merging with, as shown in -figure~\ref{fig:concepts:wdir-merge}. +%When you run the \hgcmd{merge} command, Mercurial leaves the first +%parent of the working directory unchanged, and sets the second parent +%to the changeset you're merging with, as shown in +%figure~\ref{fig:concepts:wdir-merge}. \begin{figure}[ht] \centering \includegraphics{wdir-merge} - \caption{Merging two heads} +% \caption{Merging two heads} + \caption{2$B$D$N%X%C%I$N%^!<%8(B} \label{fig:concepts:wdir-merge} \end{figure} @@ -409,7 +550,8 @@ changesets at once, the prospect of user confusion and making a terrible mess of a merge immediately becomes overwhelming. -\section{Other interesting design features} +%\section{Other interesting design features} +\section{$B@_7W$NB>$N6=L#?<$$E@(B} In the sections above, I've tried to highlight some of the most important aspects of Mercurial's design, to illustrate that it pays @@ -421,7 +563,8 @@ better idea of the amount of thinking that goes into a well-designed system. -\subsection{Clever compression} +%\subsection{Clever compression} +\subsection{$B8-$$05=L(B} When appropriate, Mercurial will store both snapshots and deltas in compressed form. It does this by always \emph{trying to} compress a @@ -441,138 +584,157 @@ the snapshot, again saving space compared to a naive delta-only approach. -\subsubsection{Network recompression} +%\subsubsection{Network recompression} +\subsubsection{$B%M%C%H%o!<%/:F05=L(B} -When storing revisions on disk, Mercurial uses the ``deflate'' -compression algorithm (the same one used by the popular \texttt{zip} -archive format), which balances good speed with a respectable -compression ratio. However, when transmitting revision data over a -network connection, Mercurial uncompresses the compressed revision -data. +%When storing revisions on disk, Mercurial uses the ``deflate'' +%compression algorithm (the same one used by the popular \texttt{zip} +%archive format), which balances good speed with a respectable +%compression ratio. However, when transmitting revision data over a +%network connection, Mercurial uncompresses the compressed revision +%data. + +%If the connection is over HTTP, Mercurial recompresses the entire +%stream of data using a compression algorithm that gives a better +%compression ratio (the Burrows-Wheeler algorithm from the widely used +%\texttt{bzip2} compression package). This combination of algorithm +%and compression of the entire stream (instead of a revision at a time) +%substantially reduces the number of bytes to be transferred, yielding +%better network performance over almost all kinds of network. -If the connection is over HTTP, Mercurial recompresses the entire -stream of data using a compression algorithm that gives a better -compression ratio (the Burrows-Wheeler algorithm from the widely used -\texttt{bzip2} compression package). This combination of algorithm -and compression of the entire stream (instead of a revision at a time) -substantially reduces the number of bytes to be transferred, yielding -better network performance over almost all kinds of network. +%(If the connection is over \command{ssh}, Mercurial \emph{doesn't} +%recompress the stream, because \command{ssh} can already do this +%itself.) + +%\subsection{Read/write ordering and atomicity} +\subsection{$BFI$_=q$-$N=g=x$H%"%H%_%C%/@-(B} -(If the connection is over \command{ssh}, Mercurial \emph{doesn't} -recompress the stream, because \command{ssh} can already do this -itself.) - -\subsection{Read/write ordering and atomicity} +%Appending to files isn't the whole story when it comes to guaranteeing +%that a reader won't see a partial write. If you recall +%figure~\ref{fig:concepts:metadata}, revisions in the changelog point to +%revisions in the manifest, and revisions in the manifest point to +%revisions in filelogs. This hierarchy is deliberate. -Appending to files isn't the whole story when it comes to guaranteeing -that a reader won't see a partial write. If you recall -figure~\ref{fig:concepts:metadata}, revisions in the changelog point to -revisions in the manifest, and revisions in the manifest point to -revisions in filelogs. This hierarchy is deliberate. +%A writer starts a transaction by writing filelog and manifest data, +%and doesn't write any changelog data until those are finished. A +%reader starts by reading changelog data, then manifest data, followed +%by filelog data. -A writer starts a transaction by writing filelog and manifest data, -and doesn't write any changelog data until those are finished. A -reader starts by reading changelog data, then manifest data, followed -by filelog data. +%Since the writer has always finished writing filelog and manifest data +%before it writes to the changelog, a reader will never read a pointer +%to a partially written manifest revision from the changelog, and it will +%never read a pointer to a partially written filelog revision from the +%manifest. -Since the writer has always finished writing filelog and manifest data -before it writes to the changelog, a reader will never read a pointer -to a partially written manifest revision from the changelog, and it will -never read a pointer to a partially written filelog revision from the -manifest. +%\subsection{Concurrent access} +\subsection{$BF1;~%"%/%;%9(B} -\subsection{Concurrent access} +%The read/write ordering and atomicity guarantees mean that Mercurial +%never needs to \emph{lock} a repository when it's reading data, even +%if the repository is being written to while the read is occurring. +%This has a big effect on scalability; you can have an arbitrary number +%of Mercurial processes safely reading data from a repository safely +%all at once, no matter whether it's being written to or not. -The read/write ordering and atomicity guarantees mean that Mercurial -never needs to \emph{lock} a repository when it's reading data, even -if the repository is being written to while the read is occurring. -This has a big effect on scalability; you can have an arbitrary number -of Mercurial processes safely reading data from a repository safely -all at once, no matter whether it's being written to or not. +%The lockless nature of reading means that if you're sharing a +%repository on a multi-user system, you don't need to grant other local +%users permission to \emph{write} to your repository in order for them +%to be able to clone it or pull changes from it; they only need +%\emph{read} permission. (This is \emph{not} a common feature among +%revision control systems, so don't take it for granted! Most require +%readers to be able to lock a repository to access it safely, and this +%requires write permission on at least one directory, which of course +%makes for all kinds of nasty and annoying security and administrative +%problems.) -The lockless nature of reading means that if you're sharing a -repository on a multi-user system, you don't need to grant other local -users permission to \emph{write} to your repository in order for them -to be able to clone it or pull changes from it; they only need -\emph{read} permission. (This is \emph{not} a common feature among -revision control systems, so don't take it for granted! Most require -readers to be able to lock a repository to access it safely, and this -requires write permission on at least one directory, which of course -makes for all kinds of nasty and annoying security and administrative -problems.) +%Mercurial uses locks to ensure that only one process can write to a +%repository at a time (the locking mechanism is safe even over +%filesystems that are notoriously hostile to locking, such as NFS). If +%a repository is locked, a writer will wait for a while to retry if the +%repository becomes unlocked, but if the repository remains locked for +%too long, the process attempting to write will time out after a while. +%This means that your daily automated scripts won't get stuck forever +%and pile up if a system crashes unnoticed, for example. (Yes, the +%timeout is configurable, from zero to infinity.) + +%\subsubsection{Safe dirstate access} +\subsubsection{$B0BA4$J(Bdirstate$B%"%/%;%9(B} -Mercurial uses locks to ensure that only one process can write to a -repository at a time (the locking mechanism is safe even over -filesystems that are notoriously hostile to locking, such as NFS). If -a repository is locked, a writer will wait for a while to retry if the -repository becomes unlocked, but if the repository remains locked for -too long, the process attempting to write will time out after a while. -This means that your daily automated scripts won't get stuck forever -and pile up if a system crashes unnoticed, for example. (Yes, the -timeout is configurable, from zero to infinity.) +%As with revision data, Mercurial doesn't take a lock to read the +%dirstate file; it does acquire a lock to write it. To avoid the +%possibility of reading a partially written copy of the dirstate file, +%Mercurial writes to a file with a unique name in the same directory as +%the dirstate file, then renames the temporary file atomically to +%\filename{dirstate}. The file named \filename{dirstate} is thus +%guaranteed to be complete, not partially written. -\subsubsection{Safe dirstate access} +%\subsection{Avoiding seeks} +\subsection{$B%7!<%/$N2sHr(B} + +%Critical to Mercurial's performance is the avoidance of seeks of the +%disk head, since any seek is far more expensive than even a +%comparatively large read operation. -As with revision data, Mercurial doesn't take a lock to read the -dirstate file; it does acquire a lock to write it. To avoid the -possibility of reading a partially written copy of the dirstate file, -Mercurial writes to a file with a unique name in the same directory as -the dirstate file, then renames the temporary file atomically to -\filename{dirstate}. The file named \filename{dirstate} is thus -guaranteed to be complete, not partially written. +Mercurial$B$N@-G=$K$O!$%G%#%9%/%X%C%I$N%7!<%/$rHr$1$k$3$H$,IT2D7g$G$"$k!%(B +$B%7!<%/$OBg5,LO$JFI$_=P$7A`:n$HHf3S$7$F$bHs>o$K9b$/$D$/!%(B + +%This is why, for example, the dirstate is stored in a single file. If +%there were a dirstate file per directory that Mercurial tracked, the +%disk would seek once per directory. Instead, Mercurial reads the +%entire single dirstate file in one step. -\subsection{Avoiding seeks} +$B$=$NM}M3$O!$Nc$($P!$(Bdirstate$B$O(B1$B$D$N%U%!%$%k$KJ]B8$5$l$F$$$k$?$a$G!$(B +Mercurial$B$,DI@W$7$F$$$k(Bdirstate$B%U%!%$%k$,%G%#%l%/%H%jKh$K$"$k(B +$B$H!$(BMercurial$B$O%G%#%l%/%H%jKh$K%7!<%/$r9T$&$3$H$K$J$k!%<B:]$K$O(BMercurial +$B$OA4BN$G0l$D$N(Bdirstate$B%U%!%$%k$r%o%s%9%F%C%W$GFI$`!%(B -Critical to Mercurial's performance is the avoidance of seeks of the -disk head, since any seek is far more expensive than even a -comparatively large read operation. - -This is why, for example, the dirstate is stored in a single file. If -there were a dirstate file per directory that Mercurial tracked, the -disk would seek once per directory. Instead, Mercurial reads the -entire single dirstate file in one step. +%Mercurial also uses a ``copy on write'' scheme when cloning a +%repository on local storage. Instead of copying every revlog file -Mercurial also uses a ``copy on write'' scheme when cloning a -repository on local storage. Instead of copying every revlog file -from the old repository into the new repository, it makes a ``hard -link'', which is a shorthand way to say ``these two names point to the -same file''. When Mercurial is about to write to one of a revlog's -files, it checks to see if the number of names pointing at the file is -greater than one. If it is, more than one repository is using the -file, so Mercurial makes a new copy of the file that is private to -this repository. +%from the old repository into the new repository, it makes a ``hard +%link'', which is a shorthand way to say ``these two names point to the +%same file''. When Mercurial is about to write to one of a revlog's +%files, it checks to see if the number of names pointing at the file is +%greater than one. If it is, more than one repository is using the +%file, so Mercurial makes a new copy of the file that is private to +%this repository. -A few revision control developers have pointed out that this idea of -making a complete private copy of a file is not very efficient in its -use of storage. While this is true, storage is cheap, and this method -gives the highest performance while deferring most book-keeping to the -operating system. An alternative scheme would most likely reduce -performance and increase the complexity of the software, each of which -is much more important to the ``feel'' of day-to-day use. +Mercurial$B$O%j%]%8%H%j$r%m!<%+%k%9%H%l!<%8$K%/%m!<%s$9$k:]$K(B``$B%3%T!<%*%s%i(B +$B%$%H(B''$B<jK!$r;H$C$F$$$k!%(B + -\subsection{Other contents of the dirstate} +%A few revision control developers have pointed out that this idea of +%making a complete private copy of a file is not very efficient in its +%use of storage. While this is true, storage is cheap, and this method +%gives the highest performance while deferring most book-keeping to the +%operating system. An alternative scheme would most likely reduce +%performance and increase the complexity of the software, each of which +%is much more important to the ``feel'' of day-to-day use. -Because Mercurial doesn't force you to tell it when you're modifying a -file, it uses the dirstate to store some extra information so it can -determine efficiently whether you have modified a file. For each file -in the working directory, it stores the time that it last modified the -file itself, and the size of the file at that time. +%\subsection{Other contents of the dirstate} +\subsection{dirstate$B$NB>$NFbMF(B} -When you explicitly \hgcmd{add}, \hgcmd{remove}, \hgcmd{rename} or -\hgcmd{copy} files, Mercurial updates the dirstate so that it knows -what to do with those files when you commit. +%Because Mercurial doesn't force you to tell it when you're modifying a +%file, it uses the dirstate to store some extra information so it can +%determine efficiently whether you have modified a file. For each file +%in the working directory, it stores the time that it last modified the +%file itself, and the size of the file at that time. -When Mercurial is checking the states of files in the working -directory, it first checks a file's modification time. If that has -not changed, the file must not have been modified. If the file's size -has changed, the file must have been modified. If the modification -time has changed, but the size has not, only then does Mercurial need -to read the actual contents of the file to see if they've changed. -Storing these few extra pieces of information dramatically reduces the -amount of data that Mercurial needs to read, which yields large -performance improvements compared to other revision control systems. +%When you explicitly \hgcmd{add}, \hgcmd{remove}, \hgcmd{rename} or +%\hgcmd{copy} files, Mercurial updates the dirstate so that it knows +%what to do with those files when you commit. -%%% Local Variables: +%When Mercurial is checking the states of files in the working +%directory, it first checks a file's modification time. If that has +%not changed, the file must not have been modified. If the file's size +%has changed, the file must have been modified. If the modification +%time has changed, but the size has not, only then does Mercurial need +%to read the actual contents of the file to see if they've changed. +%Storing these few extra pieces of information dramatically reduces the +%amount of data that Mercurial needs to read, which yields large +%performance improvements compared to other revision control systems. + +%%% Local Variables: %%% mode: yatex %%% TeX-master: "00book" %%% End:
--- a/ja/todo.txt Sun May 03 17:10:09 2009 +0900 +++ b/ja/todo.txt Tue May 12 17:49:11 2009 +0900 @@ -2,12 +2,12 @@ 00book.tex 100% branch.tex 100% collab.tex 100% -concepts.tex +concepts.tex 20% daily.tex 100% filenames.tex 100% hg_id.tex noneed hgext.tex 100% -hook.tex 80% +hook.tex 100% intro.tex 100% license.tex - mq-collab.tex 100%