comparison en/hgext.tex @ 224:34943a3d50d6
Start writing up extensions. Begin with inotify.
author | Bryan O'Sullivan <bos@serpentine.com> |
---|---|
date | Tue, 15 May 2007 16:24:20 -0700 |
parents | 4c9b9416cd23 |
children | eef2171243e8 |
223:4c9b9416cd23 | 224:34943a3d50d6 |
---|---|
1 \chapter{Adding functionality with extensions} | 1 \chapter{Adding functionality with extensions} |
2 \label{chap:hgext} | 2 \label{chap:hgext} |
3 | 3 |
4 While the core of Mercurial is quite complete from a functionality | |
5 standpoint, it's deliberately shorn of fancy features. This approach | |
6 of preserving simplicity keeps the software easy to deal with for both | |
7 maintainers and users. | |
8 | |
9 However, Mercurial doesn't box you in with an inflexible command set: | |
10 you can add features to it as \emph{extensions} (sometimes known as | |
11 \emph{plugins}). We've already discussed a few of these extensions in | |
12 earlier chapters. | |
13 \begin{itemize} | |
14 \item Section~\ref{sec:tour-merge:fetch} covers the \hgext{fetch} | |
15 extension; this combines pulling new changes and merging them with | |
16 local changes into a single command, \hgcmd{fetch}. | |
17 \item The \hgext{bisect} extension adds an efficient pruning search | |
18 for changes that introduced bugs, and we documented it in | |
19 section~\ref{sec:undo:bisect}. | |
20 \item In chapter~\ref{chap:hook}, we covered several extensions that | |
21 are useful for hook-related functionality: \hgext{acl} adds access | |
22 control lists; \hgext{bugzilla} adds integration with the Bugzilla | |
23 bug tracking system; and \hgext{notify} sends notification emails on | |
24 new changes. | |
25 \item The Mercurial Queues patch management extension is so invaluable | |
26 that it merits two chapters and an appendix all to itself. | |
27 Chapter~\ref{chap:mq} covers the basics; | |
28 chapter~\ref{chap:mq-collab} discusses advanced topics; and | |
29 appendix~\ref{chap:mqref} goes into detail on each command. | |
30 \end{itemize} | |
31 | |
32 In this chapter, we'll cover some of the other extensions that are | |
33 available for Mercurial, and briefly touch on some of the machinery | |
34 you'll need to know about if you want to write an extension of your | |
35 own. | |
36 \begin{itemize} | |
37 \item In section~\ref{sec:hgext:inotify}, we'll discuss the | |
38 possibility of \emph{huge} performance improvements using the | |
39 \hgext{inotify} extension. | |
40 \end{itemize} | |
41 | |
42 \section{Improve performance with the \hgext{inotify} extension} | |
43 \label{sec:hgext:inotify} | |
44 | |
45 Are you interested in having some of the most common Mercurial | |
46 operations run as much as a hundred times faster? Read on! | |
47 | |
48 Mercurial has great performance under normal circumstances, but some | |
49 of its work grows with the size of a repository. For example, when | |
50 you run the \hgcmd{status} command, Mercurial has to scan almost every | |
51 directory and file in your repository so that it can display file | |
52 status. Many other commands need to do the same work behind the | |
53 scenes; for example, the \hgcmd{diff} command uses the status machinery | |
54 to avoid an expensive comparison of files that obviously haven't changed. | |
55 | |
56 Because obtaining file status is crucial to good performance, the | |
57 authors of Mercurial have optimised this code to within an inch of its | |
58 life. However, there's no avoiding the fact that when you run | |
59 \hgcmd{status}, Mercurial is going to have to perform at least one | |
60 expensive system call for each managed file to determine whether it's | |
61 changed since the last time Mercurial checked. For a sufficiently | |
62 large repository, this can take a long time. | |
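
To make the cost concrete, here is a minimal Python sketch of a scan-based status check. It is not Mercurial's actual status code; the function names \texttt{snapshot} and \texttt{changed\_files} are invented for this illustration. The point is only that every check performs one \texttt{lstat} call per file, whether or not anything has changed.

```python
import os


def snapshot(root):
    """Record (size, mtime) for every file under root: one lstat per file."""
    state = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip the repository metadata directory (an assumption of this sketch).
        dirnames[:] = [d for d in dirnames if d != ".hg"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            state[os.path.relpath(path, root)] = (st.st_size, st.st_mtime_ns)
    return state


def changed_files(root, baseline):
    """Rescan the whole tree and list files that differ from the baseline."""
    current = snapshot(root)
    return sorted(
        path
        for path in current.keys() | baseline.keys()
        if current.get(path) != baseline.get(path)
    )
```

Even when \texttt{changed\_files} returns an empty list, it has still visited every file, which is why the running time grows with the size of the repository rather than with the number of changes.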
63 | |
64 To put a number on the magnitude of this effect, I created a | |
65 repository containing 150,000 managed files. I timed \hgcmd{status} | |
66 as taking ten seconds to run, even when \emph{none} of those files had | |
67 been modified. | |
68 | |
69 Many modern operating systems contain a file notification facility. | |
70 If a program signs up to an appropriate service, the operating system | |
71 will notify it every time a file of interest is created, modified, or | |
72 deleted. On Linux systems, the kernel component that does this is | |
73 called \texttt{inotify}. | |
74 | |
75 Mercurial's \hgext{inotify} extension talks to the kernel's | |
76 \texttt{inotify} component to optimise \hgcmd{status} commands. The | |
77 extension has two components. A daemon sits in the background and | |
78 receives notifications from the \texttt{inotify} subsystem. It also | |
79 listens for connections from a regular Mercurial command. The | |
80 extension modifies Mercurial's behaviour so that instead of scanning | |
81 the filesystem, it queries the daemon. Since the daemon has perfect | |
82 information about the state of the repository, it can respond with a | |
83 result instantaneously, avoiding the need to scan every directory and | |
84 file in the repository. | |
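
The division of labour described above can be sketched in a few lines of Python. This is a toy model, not the extension's real code or protocol: the class \texttt{StatusCache} and the event names \texttt{create}, \texttt{modify} and \texttt{delete} are assumptions made for the example, and real \texttt{inotify} events are more detailed. The sketch shows why a query is cheap: the daemon updates its in-memory state as notifications arrive, so answering a status request never touches the filesystem.

```python
class StatusCache:
    """Toy stand-in for the status daemon: file state is kept in memory."""

    def __init__(self, tracked):
        self.tracked = set(tracked)  # files known from the baseline scan
        self.modified = set()
        self.added = set()
        self.removed = set()

    def on_event(self, kind, path):
        """Apply one notification (hypothetical kinds: create, modify, delete)."""
        if kind == "create":
            if path in self.removed:
                # A tracked file was deleted and recreated; treat as modified.
                self.removed.discard(path)
                self.modified.add(path)
            elif path not in self.tracked:
                self.added.add(path)
        elif kind == "modify":
            if path in self.tracked and path not in self.removed:
                self.modified.add(path)
        elif kind == "delete":
            if path in self.added:
                self.added.discard(path)
            elif path in self.tracked:
                self.modified.discard(path)
                self.removed.add(path)
        else:
            raise ValueError("unknown event kind: %r" % kind)

    def status(self):
        """Answer a status query from memory, without scanning any files."""
        return {
            "modified": sorted(self.modified),
            "added": sorted(self.added),
            "removed": sorted(self.removed),
        }
```

The cost of \texttt{status} here depends on the number of changed files, not on the number of files in the repository, which is the property that makes the daemon approach fast.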
85 | |
86 Recall the ten seconds that I measured plain Mercurial as taking to | |
87 run \hgcmd{status} on a 150,000 file repository. With the | |
88 \hgext{inotify} extension enabled, the time dropped to 0.1~seconds, a | |
89 factor of \emph{one hundred} faster. | |
90 | |
91 Before we continue, please pay attention to some caveats. | |
92 \begin{itemize} | |
93 \item The \hgext{inotify} extension is Linux-specific. Because it | |
94 interfaces directly to the Linux kernel's \texttt{inotify} | |
95 subsystem, it does not work on other operating systems. | |
96 \item It should work on any Linux distribution that was released after | |
97 early~2005. Older distributions are likely to have a kernel that | |
98 lacks \texttt{inotify}, or a version of \texttt{glibc} that does not | |
99 have the necessary interfacing support. | |
100 \item Not all filesystems are suitable for use with the | |
101 \hgext{inotify} extension. Network filesystems such as NFS are a | |
102 non-starter, for example, particularly if you're running Mercurial | |
103 on several systems, all mounting the same network filesystem. The | |
104 kernel's \texttt{inotify} system has no way of knowing about changes | |
105 made on another system. Most local filesystems (e.g.~ext3, XFS, | |
106 ReiserFS) should work fine. | |
107 \end{itemize} | |
108 | |
109 The \hgext{inotify} extension is not yet shipped with Mercurial as of | |
110 May~2007, so it's a little more involved to set up than other | |
111 extensions. But the performance improvement is worth it! | |
112 | |
113 The extension currently comes in two parts: a set of patches to the | |
114 Mercurial source code, and a library of Python bindings to the | |
115 \texttt{inotify} subsystem. | |
116 \begin{note} | |
117 There are \emph{two} Python \texttt{inotify} binding libraries. One | |
118 of them is called \texttt{pyinotify}, and is packaged by some Linux | |
119 distributions as \texttt{python-inotify}. This is \emph{not} the | |
120 one you'll need, as it is too buggy and inefficient to be practical. | |
121 \end{note} | |
122 To get going, it's best to already have a functioning copy of | |
123 Mercurial installed. | |
124 \begin{note} | |
125 If you follow the instructions below, you'll be \emph{replacing} and | |
126 overwriting any existing installation of Mercurial that you might | |
127 already have, using the latest ``bleeding edge'' Mercurial code. | |
128 Don't say you weren't warned! | |
129 \end{note} | |
130 \begin{enumerate} | |
131 \item Clone the Python \texttt{inotify} binding repository. Build and | |
132 install it. | |
133 \begin{codesample4} | |
134 hg clone http://hg.kublai.com/python/inotify | |
135 cd inotify | |
136 python setup.py build --force | |
137 sudo python setup.py install --skip-build | |
138 \end{codesample4} | |
139 \item Clone the \dirname{crew} Mercurial repository. Clone the | |
140 \hgext{inotify} patch repository so that Mercurial Queues will be | |
141 able to apply patches to your copy of the \dirname{crew} repository. | |
142 \begin{codesample4} | |
143 hg clone http://hg.intevation.org/mercurial/crew | |
144 hg clone crew inotify | |
145 hg clone http://hg.kublai.com/mercurial/patches/inotify inotify/.hg/patches | |
146 \end{codesample4} | |
147 \item Make sure that you have the Mercurial Queues extension, | |
148 \hgext{mq}, enabled. If you've never used MQ, read | |
149 section~\ref{sec:mq:start} to get started quickly. | |
150 \item Go into the \dirname{inotify} repo, and apply all of the | |
151 \hgext{inotify} patches using the \hgopt{qpush}{-a} option to the | |
152 \hgcmd{qpush} command. | |
153 \begin{codesample4} | |
154 cd inotify | |
155 hg qpush -a | |
156 \end{codesample4} | |
157 If you get an error message from \hgcmd{qpush}, you should not | |
158 continue. Instead, ask for help. | |
159 \item Build and install the patched version of Mercurial. | |
160 \begin{codesample4} | |
161 python setup.py build --force | |
162 sudo python setup.py install --skip-build | |
163 \end{codesample4} | |
164 \end{enumerate} | |
165 Once you've built a suitably patched version of Mercurial, all you | |
166 need to do to enable the \hgext{inotify} extension is add an entry to | |
167 your \hgrc. | |
168 \begin{codesample2} | |
169 [extensions] | |
170 inotify = | |
171 \end{codesample2} | |
172 When the \hgext{inotify} extension is enabled, Mercurial will | |
173 automatically and transparently start the status daemon the first time | |
174 you run a command that needs status in a repository. It runs one | |
175 status daemon per repository. | |
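
If you want to check from a script whether a configuration file enables the extension, something like the following Python sketch may help. It assumes the file is simple enough to be read by Python's standard \texttt{configparser} module, which is not a full implementation of the \hgrc\ syntax; the helper name \texttt{extension\_enabled} is invented for this example and is not part of Mercurial.

```python
import configparser


def extension_enabled(hgrc_text, name):
    """Return True if [extensions] in the given hgrc text has an entry for name."""
    # interpolation=None keeps "%" characters in values from being interpreted;
    # strict=False tolerates a section that appears more than once.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(hgrc_text)
    return parser.has_option("extensions", name)
```

For the entry shown above, \texttt{extension\_enabled("[extensions]\textbackslash ninotify =\textbackslash n", "inotify")} is true, since an empty value after the equals sign still counts as an entry.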
176 | |
177 The status daemon is started silently, and runs in the background. If | |
178 you look at a list of running processes after you've enabled the | |
179 \hgext{inotify} extension and run a few commands in different | |
180 repositories, you'll thus see a few \texttt{hg} processes sitting | |
181 around, waiting for updates from the kernel and queries from | |
182 Mercurial. | |
183 | |
184 The first time you run a Mercurial command in a repository when you | |
185 have the \hgext{inotify} extension enabled, it will run with about the | |
186 same performance as a normal Mercurial command. This is because the | |
187 status daemon needs to perform a normal status scan so that it has a | |
188 baseline against which to apply later updates from the kernel. | |
189 However, \emph{every} subsequent command that does any kind of status | |
190 check should be noticeably faster on repositories of even fairly | |
191 modest size. Better yet, the bigger your repository is, the greater a | |
192 performance advantage you'll see. The \hgext{inotify} daemon makes | |
193 status operations almost instantaneous on repositories of all sizes! | |
194 | |
195 If you like, you can manually start a status daemon using the | |
196 \hgcmd{inserve} command. This gives you slightly finer control over | |
197 how the daemon ought to run. This command will of course only be | |
198 available when the \hgext{inotify} extension is enabled. | |
199 | |
200 When you're using the \hgext{inotify} extension, you should notice | |
201 \emph{no difference at all} in Mercurial's behaviour, with the sole | |
202 exception of status-related commands running a whole lot faster than | |
203 they used to. You should specifically expect that commands will not | |
204 print different output; neither should they give different results. | |
205 If either of these situations occurs, please report a bug. | |
4 | 206 |
5 %%% Local Variables: | 207 %%% Local Variables: |
6 %%% mode: latex | 208 %%% mode: latex |
7 %%% TeX-master: "00book" | 209 %%% TeX-master: "00book" |
8 %%% End: | 210 %%% End: |