% Source: en/hook.tex @ 40:b2fe9964b21b
% Commit message: More content for hook reference.
% Author: Bryan O'Sullivan <bos@serpentine.com>
% Date: Wed, 19 Jul 2006 23:25:30 -0700
% Parents: 576fef93bb49
% Children: d1a3394f8bcf
\chapter{Handling repository events with hooks}
\label{chap:hook}

Mercurial offers a powerful mechanism to let you perform automated actions in response to events that occur in a repository. In some cases, you can even control Mercurial's response to those events.

The name Mercurial uses for one of these actions is a \emph{hook}. Hooks are called ``triggers'' in some revision control systems, but the two names refer to the same idea.

\section{An overview of hooks in Mercurial}

Here is a brief list of the hooks that Mercurial supports. For each hook, we indicate when it is run, and a few examples of common tasks you can use it for. We will revisit each of these hooks in more detail later.

\begin{itemize}
\item[\small\hook{changegroup}] This is run after a group of changesets has been brought into the repository from elsewhere. In other words, it is run after a \hgcmd{pull} or \hgcmd{push} into a repository, but not after a \hgcmd{commit}. You can use this for performing an action once for the entire group of newly arrived changesets. For example, you could use this hook to send out email notifications, or kick off an automated build or test.
\item[\small\hook{commit}] This is run after a new changeset has been created in the local repository, typically using the \hgcmd{commit} command.
\item[\small\hook{incoming}] This is run once for each new changeset that is brought into the repository from elsewhere. Notice the difference from \hook{changegroup}, which is run once per \emph{group} of changesets brought in. You can use this for the same purposes as the \hook{changegroup} hook; sometimes it is simply more convenient to run a hook once per group of changesets, while at other times it is handier to run it once per changeset.
\item[\small\hook{outgoing}] This is run after a group of changesets has been transmitted from this repository to another. You can use this, for example, to notify subscribers every time changes are cloned or pulled from the repository.
\item[\small\hook{prechangegroup}] This is run before starting to bring a group of changesets into the repository. It cannot see the actual changesets, because they have not yet been transmitted. If it fails, the changesets will not be transmitted. You can use this hook to ``lock down'' a repository against incoming changes.
\item[\small\hook{precommit}] This is run before starting a commit. It cannot tell what files are included in the commit, or any other information about the commit. If it fails, the commit will not be allowed to start. You can use this to perform a build and require it to complete successfully before a commit can proceed, or automatically enforce a requirement that modified files pass your coding style guidelines.
\item[\small\hook{preoutgoing}] This is run before starting to transmit a group of changesets from this repository. You can use this to lock a repository against clones or pulls from remote clients.
\item[\small\hook{pretag}] This is run before creating a tag. If it fails, the tag will not be created. You can use this to enforce a uniform tag naming convention.
\item[\small\hook{pretxnchangegroup}] This is run after a group of changesets has been brought into the local repository from another, but before the transaction completes that will make the changes permanent in the repository. If it fails, the transaction will be rolled back and the changes will disappear from the local repository. You can use this to automatically check newly arrived changes and, for example, roll them back if the group as a whole does not build or pass your test suite.
\item[\small\hook{pretxncommit}] This is run after a new changeset has been created in the local repository, but before the transaction completes that will make it permanent. Unlike the \hook{precommit} hook, this hook can see which changes are present in the changeset, and it can also see all other changeset metadata, such as the commit message.
You can use this to require that a commit message follows your local conventions, or that a changeset builds cleanly.
\item[\small\hook{preupdate}] This is run before starting an update or merge of the working directory.
\item[\small\hook{tag}] This is run after a tag is created.
\item[\small\hook{update}] This is run after an update or merge of the working directory has finished.
\end{itemize}

Each of the hooks with a ``\texttt{pre}'' prefix has the ability to \emph{control} an activity. If the hook succeeds, the activity may proceed; if it fails, the activity is either not permitted or undone, depending on the hook.

\section{Hooks and security}

\subsection{Hooks are run with your privileges}

When you run a Mercurial command in a repository, and the command causes a hook to run, that hook runs on your system, under your user account, with your privilege level. Since hooks are arbitrary pieces of executable code, you should treat them with an appropriate level of suspicion. Do not install a hook unless you are confident that you know who created it and what it does.

In some cases, you may be exposed to hooks that you did not install yourself. If you work with Mercurial on an unfamiliar system, Mercurial will run hooks defined in that system's global \hgrc\ file.

If you are working with a repository owned by another user, Mercurial will run hooks defined in that repository. For example, if you \hgcmd{pull} from that repository, and its \sfilename{.hg/hgrc} defines a local \hook{outgoing} hook, that hook will run under your user account, even though you don't own that repository.

\begin{note}
This only applies if you are pulling from a repository on a local or network filesystem. If you're pulling over http or ssh, any \hook{outgoing} hook will run under the account of the server process, on the server.
\end{note}

XXX To see what hooks are defined in a repository, use the \hgcmdargs{config}{hooks} command.
If you are working in one repository, but talking to another that you do not own (e.g.~using \hgcmd{pull} or \hgcmd{incoming}), remember that it is the other repository's hooks you should be checking, not your own.

\subsection{Hooks do not propagate}

In Mercurial, hooks are not revision controlled, and do not propagate when you clone, or pull from, a repository. The reason for this is simple: a hook is a completely arbitrary piece of executable code. It runs under your user identity, with your privilege level, on your machine.

It would be extremely reckless for any distributed revision control system to implement revision-controlled hooks, as this would offer an easily exploitable way to subvert the accounts of users of the revision control system.

Since Mercurial does not propagate hooks, if you are collaborating with other people on a common project, you should not assume that they are using the same Mercurial hooks as you are, or that theirs are correctly configured. You should document the hooks you expect people to use.

In a corporate intranet, this is somewhat easier to control, as you can for example provide a ``standard'' installation of Mercurial on an NFS filesystem, and use a site-wide \hgrc\ file to define hooks that all users will see. However, this too has its limits; see below.

\subsection{Hooks can be overridden}

Mercurial allows you to override a hook definition by redefining the hook. You can disable it by setting its value to the empty string, or change its behaviour as you wish.

If you deploy a system-~or site-wide \hgrc\ file that defines some hooks, you should thus understand that your users can disable or override those hooks.

\subsection{Ensuring that critical hooks are run}

Sometimes you may want to enforce a policy that you do not want others to be able to work around. For example, you may have a requirement that every changeset must pass a rigorous set of tests.
Defining this requirement via a hook in a site-wide \hgrc\ won't work for remote users on laptops, and of course local users can subvert it at will by overriding the hook.

Instead, you can set up your policies for use of Mercurial so that people are expected to propagate changes through a well-known ``canonical'' server that you have locked down and configured appropriately.

One way to do this is via a combination of social engineering and technology. Set up a restricted-access account; users can push changes over the network to repositories managed by this account, but they cannot log into the account and run normal shell commands. In this scenario, a user can commit a changeset that contains any old garbage they want.

When someone pushes a changeset to the server that everyone pulls from, the server will test the changeset before it accepts it as permanent, and reject it if it fails to pass the test suite. If people only pull changes from this filtering server, it will serve to ensure that all changes that people pull have been automatically vetted.

\section{A short tutorial on using hooks}
\label{sec:hook:simple}

It is easy to write a Mercurial hook. Let's start with a hook that runs when you finish a \hgcmd{commit}, and simply prints the hash of the changeset you just created. The hook is called \hook{commit}.

\begin{figure}[ht]
\interaction{hook.simple.init}
\caption{A simple hook that runs when a changeset is committed}
\label{ex:hook:init}
\end{figure}

All hooks follow the pattern in example~\ref{ex:hook:init}. You add an entry to the \rcsection{hooks} section of your \hgrc. On the left is the name of the event to trigger on; on the right is the action to take. As you can see, you can run an arbitrary shell command in a hook. Mercurial passes extra information to the hook using environment variables (look for \envar{HG\_NODE} in the example).
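
As a rough illustration of how an external hook picks up its parameters, the following sketch is a small Python program that could serve as the command of a \hook{commit} hook. The function name \texttt{describe\_commit} is hypothetical; the only detail taken from the text above is that Mercurial exports the new changeset's hash in \envar{HG\_NODE}.

```python
import os


def describe_commit(environ):
    """Build the message a commit hook might print from its environment."""
    # Mercurial passes the new changeset's hash in HG_NODE. Fall back to a
    # notice so the script can also be run by hand, outside of a hook.
    node = environ.get("HG_NODE", "")
    if not node:
        return "no HG_NODE set (not running as a hook?)"
    return "committed " + node


if __name__ == "__main__":
    # Falling off the end of the script gives an exit status of zero.
    print(describe_commit(os.environ))
```

Such a script could be named on the right-hand side of a \texttt{commit} entry in the \rcsection{hooks} section; its location on disk is left to the reader.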
\subsection{Performing multiple actions per event}

Quite often, you will want to define more than one hook for a particular kind of event, as shown in example~\ref{ex:hook:ext}. Mercurial lets you do this by adding an \emph{extension} to the end of a hook's name. You extend a hook's name by giving the name of the hook, followed by a full stop (the ``\texttt{.}'' character), followed by some more text of your choosing. For example, Mercurial will run both \texttt{commit.foo} and \texttt{commit.bar} when the \texttt{commit} event occurs.

\begin{figure}[ht]
\interaction{hook.simple.ext}
\caption{Defining a second \hook{commit} hook}
\label{ex:hook:ext}
\end{figure}

To give a well-defined order of execution when there are multiple hooks defined for an event, Mercurial sorts hooks by extension, and executes the hook commands in this sorted order. In the above example, it will execute \texttt{commit.bar} before \texttt{commit.foo}, and \texttt{commit} before both.

It is a good idea to use a somewhat descriptive extension when you define a new hook. This will help you to remember what the hook was for. If the hook fails, you'll get an error message that contains the hook name and extension, so using a descriptive extension could give you an immediate hint as to why the hook failed (see section~\ref{sec:hook:perm} for an example).

\subsection{Controlling whether an activity can proceed}
\label{sec:hook:perm}

In our earlier examples, we used the \hook{commit} hook, which is run after a commit has completed. This is one of several Mercurial hooks that run after an activity finishes. Such hooks have no way of influencing the activity itself.

Mercurial defines a number of events that occur before an activity starts; or after it starts, but before it finishes. Hooks that trigger on these events have the added ability to choose whether the activity can continue, or will abort.

The \hook{pretxncommit} hook runs after a commit has all but completed.
In other words, the metadata representing the changeset has been written out to disk, but the transaction has not yet been allowed to complete. The \hook{pretxncommit} hook has the ability to decide whether the transaction can complete, or must be rolled back.

If the \hook{pretxncommit} hook exits with a status code of zero, the transaction is allowed to complete; the commit finishes; and the \hook{commit} hook is run. If the \hook{pretxncommit} hook exits with a non-zero status code, the transaction is rolled back; the metadata representing the changeset is erased; and the \hook{commit} hook is not run.

\begin{figure}[ht]
\interaction{hook.simple.pretxncommit}
\caption{Using the \hook{pretxncommit} hook to control commits}
\label{ex:hook:pretxncommit}
\end{figure}

The hook in example~\ref{ex:hook:pretxncommit} checks that a commit comment contains a bug ID. If it does, the commit can complete. If not, the commit is rolled back.

\section{Writing your own hooks}

When you are writing a hook, you might find it useful to run Mercurial either with the \hggopt{-v} option, or the \rcitem{ui}{verbose} config item set to ``true''. When you do so, Mercurial will print a message before it calls each hook.

\subsection{Choosing how your hook should run}
\label{sec:hook:lang}

You can write a hook either as a normal program, typically a shell script, or as a Python function that is executed within the Mercurial process.

Writing a hook as an external program has the advantage that it requires no knowledge of Mercurial's internals. You can call normal Mercurial commands to get any added information you need. The trade-off is that external hooks are slower than in-process hooks.

An in-process Python hook has complete access to the Mercurial API, and does not ``shell out'' to another process, so it is inherently faster than an external hook. It is also easier to obtain much of the information that a hook requires by using the Mercurial API than by running Mercurial commands.
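
To make the comparison more concrete, here is a minimal sketch of what the bug ID check from example~\ref{ex:hook:pretxncommit} might look like as an in-process hook. It rests on several assumptions that are not taken from this chapter: that \texttt{repo[node]} yields a changeset object with a \texttt{description()} method (this may differ between Mercurial releases), that \texttt{ui.warn} accepts a byte string, and that a bug ID looks like ``bug 1234''. The names \texttt{has\_bug\_id} and \texttt{require\_bug\_id} are hypothetical. An in-process hook signals failure by returning a true value.

```python
import re

# Hypothetical convention: a bug ID looks like "bug 1234" or "Bug #1234".
BUG_ID = re.compile(r"\bbug\s*#?\d+", re.IGNORECASE)


def has_bug_id(message):
    """Return True if the commit message mentions a bug ID."""
    return BUG_ID.search(message) is not None


def require_bug_id(ui, repo, node=None, **kwargs):
    """Sketch of an in-process pretxncommit hook."""
    # Assumption: repo[node] gives access to the pending changeset, and
    # description() returns its commit message (text or bytes).
    message = repo[node].description()
    if isinstance(message, bytes):
        message = message.decode("utf-8", "replace")
    if has_bug_id(message):
        return False  # false means success: the transaction may complete
    ui.warn(b"commit message must mention a bug ID\n")
    return True  # true means failure: the transaction is rolled back
```

Keeping the pattern match in its own function means the rule can be exercised without a repository at hand.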
If you are comfortable with Python, or require high performance, writing your hooks in Python may be a good choice. However, when you have a straightforward hook to write and you don't need to care about performance (probably the majority of hooks), a shell script is perfectly fine.

\subsection{Hook parameters}
\label{sec:hook:param}

Mercurial calls each hook with a set of well-defined parameters. In Python, a parameter is passed as a keyword argument to your hook function. For an external program, a parameter is passed as an environment variable.

Whether your hook is written in Python or as a shell script, the hook-specific parameter names and values will be the same. A boolean parameter will be represented as a boolean value in Python, but as the number 1 (for ``true'') or 0 (for ``false'') as an environment variable for an external hook. If a hook parameter is named \texttt{foo}, the keyword argument for a Python hook will also be named \texttt{foo}, while the environment variable for an external hook will be named \texttt{HG\_FOO}.

\subsection{Hook return values and activity control}

A hook that executes successfully must exit with a status of zero if external, or return boolean ``false'' if in-process. Failure is indicated with a non-zero exit status from an external hook, or an in-process hook returning boolean ``true''. If an in-process hook raises an exception, the hook is considered to have failed.

For a hook that controls whether an activity can proceed, zero/false means ``allow'', while non-zero/true/exception means ``deny''.

\subsection{Writing an external hook}

When you define an external hook in your \hgrc\ and the hook is run, its value is passed to your shell, which interprets it. This means that you can use normal shell constructs in the body of the hook.

An executable hook is always run with its current directory set to a repository's root directory.
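
The naming rules for parameters can be sketched in a few lines of Python. This is only an illustration for an external hook written as a Python program; the helper names \texttt{hook\_params} and \texttt{as\_bool} are hypothetical and not part of Mercurial.

```python
import os


def hook_params(environ):
    """Collect hook parameters from HG_-prefixed environment variables."""
    params = {}
    for name, value in environ.items():
        if name.startswith("HG_"):
            # HG_FOO in the environment corresponds to the parameter foo.
            params[name[len("HG_"):].lower()] = value
    return params


def as_bool(value):
    """Interpret an external hook's boolean parameter: 1 is true, 0 is false."""
    return value == "1"


if __name__ == "__main__":
    # Print whatever parameters this process was given, one per line.
    for key, value in sorted(hook_params(os.environ).items()):
        print(key + "=" + value)
```

Run by hand outside a hook, the script normally prints nothing, since no \texttt{HG\_}-prefixed variables are set.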
Each hook parameter is passed in as an environment variable; the name is upper-cased, and prefixed with the string ``\texttt{HG\_}''.

With the exception of hook parameters, Mercurial does not set or modify any environment variables when running a hook. This is useful to remember if you are writing a site-wide hook that may be run by a number of different users with differing environment variables set. In multi-user situations, you should not rely on environment variables being set to the values you have in your environment when testing the hook.

\subsection{Telling Mercurial to use an in-process hook}

The \hgrc\ syntax for defining an in-process hook is slightly different than for an executable hook. The value of the hook must start with the text ``\texttt{python:}'', and continue with the fully-qualified name of a callable object to use as the hook's value.

The module in which a hook lives is automatically imported when a hook is run. So long as you have the module name and \envar{PYTHONPATH} right, it should ``just work''.

The following \hgrc\ example snippet illustrates the syntax and meaning of the notions we just described.
\begin{codesample2}
  [hooks]
  commit.example = python:mymodule.submodule.myhook
\end{codesample2}
When Mercurial runs the \texttt{commit.example} hook, it imports \texttt{mymodule.submodule}, looks for the callable object named \texttt{myhook}, and calls it.

\subsection{Writing an in-process hook}

The simplest in-process hook does nothing, but illustrates the basic shape of the hook API:
\begin{codesample2}
  def myhook(ui, repo, **kwargs):
      pass
\end{codesample2}
The first argument to a Python hook is always a \pymodclass{mercurial.ui}{ui} object. The second is a repository object; at the moment, it is always an instance of \pymodclass{mercurial.localrepo}{localrepository}. Following these two arguments are other keyword arguments.
Which ones are passed in depends on the hook being called, but a hook can ignore arguments it doesn't care about by dropping them into a keyword argument dict, as with \texttt{**kwargs} above.

\section{Hook reference}

\subsection{In-process hook execution}

An in-process hook is called with arguments of the following form:
\begin{codesample2}
  def myhook(ui, repo, **kwargs):
      pass
\end{codesample2}
The \texttt{ui} parameter is a \pymodclass{mercurial.ui}{ui} object. The \texttt{repo} parameter is a \pymodclass{mercurial.localrepo}{localrepository} object. The names and values of the \texttt{**kwargs} parameters depend on the hook being invoked, with the following common features:
\begin{itemize}
\item If a parameter is named \texttt{node} or \texttt{parent\emph{N}}, it will contain a hexadecimal changeset ID. The empty string is used to represent ``null changeset ID'' instead of a string of zeroes.
\item Boolean-valued parameters are represented as Python \texttt{bool} objects.
\end{itemize}

An in-process hook is called without a change to the process's working directory (unlike external hooks, which are run in the root of the repository). It must not change the process's working directory. If it were to do so, it would probably cause calls to the Mercurial API, or operations after the hook finishes, to fail.

If a hook returns a boolean ``false'' value, it is considered to have succeeded. If it returns a boolean ``true'' value or raises an exception, it is considered to have failed.

\subsection{External hook execution}

An external hook is passed to the user's shell for execution, so features of that shell, such as variable substitution and command redirection, are available. The hook is run in the root directory of the repository.

Hook parameters are passed to the hook as environment variables. Each environment variable's name is converted to upper case and prefixed with the string ``\texttt{HG\_}''.
For example, if the name of a parameter is ``\texttt{node}'', the name of the environment variable representing that parameter will be ``\texttt{HG\_NODE}''.

A boolean parameter is represented as the string ``\texttt{1}'' for ``true'', ``\texttt{0}'' for ``false''. If an environment variable is named \envar{HG\_NODE}, \envar{HG\_PARENT1} or \envar{HG\_PARENT2}, it contains a changeset ID represented as a hexadecimal string. The empty string is used to represent ``null changeset ID'' instead of a string of zeroes.

If a hook exits with a status of zero, it is considered to have succeeded. If it exits with a non-zero status, it is considered to have failed.

\subsection{The \hook{changegroup} hook}
\label{sec:hook:changegroup}

This hook is run after a group of pre-existing changesets has been added to the repository, for example via a \hgcmd{pull} or \hgcmd{unbundle}. This hook is run once per operation that added one or more changesets.

Parameters to this hook:
\begin{itemize}
\item[\texttt{node}] A changeset ID. The changeset ID of the first changeset in the group that was added. All changesets between this and \index{tags!\texttt{tip}}\texttt{tip}, inclusive, were added by a single \hgcmd{pull}, \hgcmd{push} or \hgcmd{unbundle}.
\end{itemize}

See also: \hook{incoming} (section~\ref{sec:hook:incoming}), \hook{prechangegroup} (section~\ref{sec:hook:prechangegroup}), \hook{pretxnchangegroup} (section~\ref{sec:hook:pretxnchangegroup})

\subsection{The \hook{commit} hook}
\label{sec:hook:commit}

This hook is run after a new changeset has been created.

Parameters to this hook:
\begin{itemize}
\item[\texttt{node}] A changeset ID. The changeset ID of the newly committed changeset.
\item[\texttt{parent1}] A changeset ID. The changeset ID of the first parent of the newly committed changeset.
\item[\texttt{parent2}] A changeset ID. The changeset ID of the second parent of the newly committed changeset.
\end{itemize}

See also: \hook{precommit} (section~\ref{sec:hook:precommit}), \hook{pretxncommit} (section~\ref{sec:hook:pretxncommit})

\subsection{The \hook{incoming} hook}
\label{sec:hook:incoming}

This hook is run after a pre-existing changeset has been added to the repository, for example via a \hgcmd{push}. If a group of changesets was added in a single operation, this hook is called once for each added changeset.

Parameters to this hook:
\begin{itemize}
\item[\texttt{node}] A changeset ID. The ID of the newly added changeset.
\end{itemize}

See also: \hook{changegroup} (section~\ref{sec:hook:changegroup}), \hook{prechangegroup} (section~\ref{sec:hook:prechangegroup}), \hook{pretxnchangegroup} (section~\ref{sec:hook:pretxnchangegroup})

\subsection{The \hook{outgoing} hook}
\label{sec:hook:outgoing}

This hook is run after a group of changesets has been propagated out of this repository, for example by a \hgcmd{push} or \hgcmd{bundle} command.

Parameters to this hook:
\begin{itemize}
\item[\texttt{node}] A changeset ID. The changeset ID of the first changeset of the group that was sent.
\item[\texttt{source}] A string. The source of the operation. If a remote client pulled changes from this repository, \texttt{source} will be \texttt{serve}. If the client that obtained changes from this repository was local, \texttt{source} will be \texttt{bundle}, \texttt{pull}, or \texttt{push}, depending on the operation the client performed.
\end{itemize}

See also: \hook{preoutgoing} (section~\ref{sec:hook:preoutgoing})

\subsection{The \hook{prechangegroup} hook}
\label{sec:hook:prechangegroup}

This hook is not passed any parameters.
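
Because this hook receives no parameters, an in-process version can be very small. The sketch below shows one way the ``lock down'' use mentioned in the overview might be written; the function name \texttt{lock\_down} is hypothetical, and passing a byte string to \texttt{ui.warn} is an assumption about the Mercurial release in use.

```python
def lock_down(ui, repo, **kwargs):
    """Sketch of a prechangegroup hook that refuses all incoming changesets."""
    # Tell the user why the operation is being refused.
    ui.warn(b"this repository is closed to incoming changes\n")
    # A true return value marks the hook as failed, so the changesets
    # are not brought into the repository.
    return True
```

It could be enabled with an entry such as \texttt{prechangegroup.lockdown = python:mymodule.lock\_down}, where \texttt{mymodule} stands for whatever module holds the function.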

See also: \hook{changegroup} (section~\ref{sec:hook:changegroup}), \hook{incoming} (section~\ref{sec:hook:incoming}), \hook{pretxnchangegroup} (section~\ref{sec:hook:pretxnchangegroup})

\subsection{The \hook{precommit} hook}
\label{sec:hook:precommit}

This hook is invoked before Mercurial has obtained any of the metadata for the commit, such as the commit message or date.

Parameters to this hook:
\begin{itemize}
\item[\texttt{parent1}] A changeset ID. The changeset ID of the first parent of the working directory.
\item[\texttt{parent2}] A changeset ID. The changeset ID of the second parent of the working directory.
\end{itemize}
If the commit proceeds, the parents of the working directory will become the parents of the new changeset.

See also: \hook{commit} (section~\ref{sec:hook:commit}), \hook{pretxncommit} (section~\ref{sec:hook:pretxncommit})

\subsection{The \hook{preoutgoing} hook}
\label{sec:hook:preoutgoing}

This hook is invoked before Mercurial knows the identities of the changesets to be transmitted.

Parameters to this hook:
\begin{itemize}
\item[\texttt{source}] A string. The source of the operation that is attempting to obtain changes from this repository. See the documentation for the \texttt{source} parameter to the \hook{outgoing} hook, in section~\ref{sec:hook:outgoing}, for possible values of this parameter.
\end{itemize}

See also: \hook{outgoing} (section~\ref{sec:hook:outgoing})

\subsection{The \hook{pretag} hook}
\label{sec:hook:pretag}

Parameters to this hook:
\begin{itemize}
\item[\texttt{local}] A boolean. Whether the tag is local to this repository instance (i.e.~stored in \sfilename{.hg/localtags}) or managed by Mercurial (stored in \sfilename{.hgtags}).
\item[\texttt{node}] A changeset ID. The ID of the changeset to be tagged.
\item[\texttt{tag}] A string. The name of the tag to be created.
\end{itemize}

If the tag to be created is revision-controlled, the \hook{precommit} and \hook{pretxncommit} hooks (sections~\ref{sec:hook:precommit} and~\ref{sec:hook:pretxncommit}) will also be run.

See also: \hook{tag} (section~\ref{sec:hook:tag})

\subsection{The \hook{pretxnchangegroup} hook}
\label{sec:hook:pretxnchangegroup}

Parameters to this hook are the same as for the \hook{changegroup} hook; see section~\ref{sec:hook:changegroup} for details.

See also: \hook{changegroup} (section~\ref{sec:hook:changegroup}), \hook{incoming} (section~\ref{sec:hook:incoming}), \hook{prechangegroup} (section~\ref{sec:hook:prechangegroup})

\subsection{The \hook{pretxncommit} hook}
\label{sec:hook:pretxncommit}

Parameters to this hook are the same as for the \hook{commit} hook; see section~\ref{sec:hook:commit} for details.

See also: \hook{precommit} (section~\ref{sec:hook:precommit})

\subsection{The \hook{preupdate} hook}
\label{sec:hook:preupdate}

Parameters to this hook:
\begin{itemize}
\item[\texttt{parent1}] A changeset ID. The ID of the parent that the working directory is to be updated to. If the working directory is being merged, it will not change this parent.
\item[\texttt{parent2}] A changeset ID. Only set if the working directory is being merged. The ID of the revision that the working directory is being merged with.
\end{itemize}

See also: \hook{update} (section~\ref{sec:hook:update})

\subsection{The \hook{tag} hook}
\label{sec:hook:tag}

Parameters to this hook are the same as for the \hook{pretag} hook; see section~\ref{sec:hook:pretag} for details.

If the created tag is revision-controlled, the \hook{commit} hook (section~\ref{sec:hook:commit}) will also be run.

See also: \hook{pretag} (section~\ref{sec:hook:pretag})

\subsection{The \hook{update} hook}
\label{sec:hook:update}

Parameters to this hook:
\begin{itemize}
\item[\texttt{error}] A boolean. Indicates whether the update or merge completed successfully.
\item[\texttt{parent1}] A changeset ID.
The ID of the parent that the working directory was updated to. If the working directory was merged, it will not have changed this parent.
\item[\texttt{parent2}] A changeset ID. Only set if the working directory was merged. The ID of the revision that the working directory was merged with.
\end{itemize}

See also: \hook{preupdate} (section~\ref{sec:hook:preupdate})

%%% Local Variables:
%%% mode: latex
%%% TeX-master: "00book"
%%% End: