diff en/collab.tex @ 159:7355af913937

First steps on collaboration chapter.
author Bryan O'Sullivan <bos@serpentine.com>
date Thu, 22 Mar 2007 23:03:11 -0700
parents
children 5fc4a45c069f
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/en/collab.tex	Thu Mar 22 23:03:11 2007 -0700
@@ -0,0 +1,236 @@
+\chapter{Collaborating with other people}
+\label{cha:collab}
+
+As a completely decentralised tool, Mercurial doesn't impose any
+policy on how people ought to work with each other.  However, if
+you're new to distributed revision control, it helps to have some
+tools and examples in mind when you're thinking about possible
+workflow models.
+
+\section{Collaboration models}
+
+With a suitably flexible tool, making decisions about workflow is much
+more of a social engineering challenge than a technical one.
+Mercurial imposes few limitations on how you can structure the flow of
+work in a project, so it's up to you and your group to set up and live
+with a model that matches your own particular needs.
+
+\subsection{Factors to keep in mind}
+
+The most important aspect of any model that you must keep in mind is
+how well it matches the needs and capabilities of the people who will
+be using it.  This might seem self-evident; even so, you still can't
+afford to forget it for a moment.
+
+I once put together a workflow model that seemed to make perfect sense
+to me, but that caused a considerable amount of consternation and
+strife within my development team.  In spite of my attempts to explain
+why we needed a complex set of branches, and how changes ought to flow
+between them, a few team members revolted.  Even though they were
+smart people, they didn't want to pay attention to the constraints we
+were operating under, or face the consequences of those constraints in
+the details of the model that I was advocating.
+
+Don't sweep foreseeable social or technical problems under the rug.
+Whatever scheme you put into effect, you should plan for mistakes and
+problem scenarios.  Consider adding automated machinery to prevent, or
+quickly recover from, trouble that you can anticipate.  As an example,
+if you intend to have a branch with not-for-release changes in it,
+you'd do well to think early about the possibility that someone might
+accidentally merge those changes into a release branch.  You could
+avoid this particular problem by writing a hook that prevents changes
+from being merged from an inappropriate branch.
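+
+As a rough sketch of such a hook, suppose (purely for illustration)
+that you use named branches called \texttt{release} and
+\texttt{experimental}.  A \texttt{pretxncommit} hook could then refuse
+to commit a merge of the second into the first.  The hook name, script
+name, and script location are all hypothetical, and the template
+keyword used to read the branch name may differ between Mercurial
+versions.
+\begin{verbatim}
+# In the repository's .hg/hgrc (the script path is a made-up example):
+[hooks]
+pretxncommit.nomerge = /usr/local/bin/check-merge.sh
+\end{verbatim}
+\begin{verbatim}
+#!/bin/sh
+# check-merge.sh (hypothetical): reject merges of the "experimental"
+# branch into the "release" branch.  Both branch names are assumptions.
+# HG_PARENT2 is empty unless the changeset being committed is a merge.
+[ -z "$HG_PARENT2" ] && exit 0
+target=`hg log -r "$HG_PARENT1" --template '{branch}'`
+source=`hg log -r "$HG_PARENT2" --template '{branch}'`
+if [ "$target" = release ] && [ "$source" = experimental ]; then
+    echo "refusing to merge experimental into release" >&2
+    exit 1   # a non-zero exit status makes the commit fail
+fi
+exit 0
+\end{verbatim}
+A hook like this only guards commits made in the repository where it
+is installed, so a shared repository would need a comparable check on
+incoming changes.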
+
+\subsection{Informal anarchy}
+
+I wouldn't suggest an ``anything goes'' approach as something
+sustainable, but it's a model that's easy to grasp, and it works
+perfectly well in a few unusual situations.
+
+As one example, many projects have a loose-knit group of collaborators
+who rarely physically meet each other.  Some groups like to overcome
+the isolation of working at a distance by organising occasional
+``sprints''.  In a sprint, a number of people get together in a single
+location (a company's conference room, a hotel meeting room, that kind
+of place) and spend several days more or less locked in there, hacking
+intensely on a handful of projects.
+
+A sprint is the perfect place to use the \hgcmd{serve} command, since
+\hgcmd{serve} does not require any fancy server infrastructure.  You
+can get started with \hgcmd{serve} in moments, by reading
+section~\ref{sec:collab:serve} below.  Then simply tell the person
+next to you that you're running a server, send the URL to them in an
+instant message, and you immediately have a quick-turnaround way to
+work together.  They can type your URL into their web browser and
+quickly review your changes; or they can pull a bugfix from you and
+verify it; or they can clone a branch containing a new feature and try
+it out.
+
+The charm, and the problem, with doing things in an ad hoc fashion
+like this is that only people who know about your changes, and where
+they are, can see them.  Such an informal approach simply doesn't
+scale beyond a handful of people, because each individual needs to
+know about $n$ different repositories to pull from.
+
+\subsection{A single central repository}
+
+For smaller projects migrating from a centralised revision control
+tool, perhaps the easiest way to get started is to have changes flow
+through a single shared central repository.  This is also the
+most common ``building block'' for more ambitious workflow schemes.
+
+Contributors start by cloning a copy of this repository.  They can
+pull changes from it whenever they need to, and some (perhaps all)
+developers have permission to push a change back when they're ready
+for other people to see it.
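+
+In terms of commands, this cycle might look like the following sketch.
+The repository URL and directory name are invented for illustration.
+\begin{verbatim}
+# Make a local copy of the shared repository (hypothetical URL).
+hg clone http://hg.example.com/project my-project
+cd my-project
+# Later: fetch new changes and update the working directory.
+hg pull -u
+# Publish your committed changes, if you have push permission.
+hg push
+\end{verbatim}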
+
+Under this model, it can still sometimes make sense for people to pull
+changes directly from each other, without going through the central
+repository.  Consider a case in which I have a tentative bug fix, but
+I am worried that if I were to publish it to the central repository,
+it might subsequently break everyone else's trees as they pull it.  To
+reduce the potential for damage, I can ask you to clone my repository
+into a temporary repository of your own and test it.  This lets us put
+off publishing the potentially unsafe change until it has had a little
+testing.
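+
+Assuming that I am publishing my repository with \hgcmd{serve} at a
+placeholder URL such as \Verb|http://my-laptop.local:8000/|, your side
+of this might be as simple as the following:
+\begin{verbatim}
+# Clone into a throwaway directory; the name is arbitrary.
+hg clone http://my-laptop.local:8000/ bugfix-test
+cd bugfix-test
+# ... build and test here, then delete the directory when done.
+\end{verbatim}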
+
+In this kind of scenario, people usually use the \command{ssh}
+protocol to securely push changes to the central repository, as
+documented in section~\ref{sec:collab:ssh}.  It's also usual to
+publish a read-only copy of the repository over HTTP using CGI, as in
+section~\ref{sec:collab:cgi}.  Publishing over HTTP satisfies the
+needs of people who don't have push access, and those who want to use
+web browsers to browse the repository's history.
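+
+For illustration, with an invented host name, user name, and path,
+pushing over \command{ssh} and cloning the read-only HTTP copy might
+look like this:
+\begin{verbatim}
+# Push over ssh; the path is relative to the user's home directory.
+hg push ssh://alice@hg.example.com/repos/project
+# Anyone can clone the read-only copy published over HTTP.
+hg clone http://hg.example.com/project
+\end{verbatim}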
+
+\subsection{The Linux kernel model}
+
+The development of the Linux kernel has a shallow hierarchical
+structure, surrounded by a cloud of apparent chaos.  Because most
+Linux developers use \command{git}, a distributed revision control
+tool with capabilities similar to Mercurial, it's useful to describe
+the way work flows in that environment; if you like the ideas, the
+approach translates well across tools.
+
+At the centre of the community sits Linus Torvalds, the creator of
+Linux.  He publishes a single source repository that is considered the
+``authoritative'' current tree by the entire developer community.
+Anyone can clone Linus's tree, but he is very choosy about whose trees
+he pulls from.
+
+Linus has a number of ``trusted lieutenants''.  As a general rule, he
+pulls whatever changes they publish, in most cases without even
+reviewing those changes.  Some of those lieutenants are generally
+agreed to be ``maintainers'', responsible for specific subsystems
+within the kernel.  If a random kernel hacker wants to make a change
+to a subsystem that they want to end up in Linus's tree, they must
+find out who the subsystem's maintainer is, and ask that maintainer to
+take their change.  If the maintainer reviews their changes and agrees
+to take them, the maintainer will pass them along to Linus in due
+course.
+
+Individual lieutenants have their own approaches to reviewing,
+accepting, and publishing changes; and for deciding when to feed them
+to Linus.  In addition, there are several well known branches that
+people use for different purposes.  For example, a few people maintain
+``stable'' repositories of older versions of the kernel, to which they
+apply critical fixes as needed.
+
+This model has two notable features.  The first is that it's ``pull
+only''.  You have to ask, convince, or beg another developer to take a
+change from you, because there are no shared trees, and there's no way
+to push changes into a tree that someone else controls.
+
+The second is that it's based on reputation and acclaim.  If you're an
+unknown, Linus will probably ignore changes from you without even
+responding.  But a subsystem maintainer will probably review them, and
+will likely take them if they pass their criteria for suitability.
+The more ``good'' changes you contribute to a maintainer, the more
+likely they are to trust your judgment and accept your changes.  If
+you're well-known and maintain a long-lived branch for something Linus
+hasn't yet accepted, people with similar interests may pull your
+changes regularly to keep up with your work.
+
+Reputation and acclaim don't necessarily cross subsystem or ``people''
+boundaries.  If you're a respected but specialised storage hacker, and
+you try to fix a networking bug, that change will receive a level of
+scrutiny from a network maintainer comparable to a change from a
+complete stranger.
+
+To people who come from more orderly project backgrounds, the
+comparatively chaotic Linux kernel development process often seems
+completely insane.  It's subject to the whims of individuals; people
+make sweeping changes whenever they deem it appropriate; and the pace
+of development is astounding.  And yet Linux is a highly successful,
+well-regarded piece of software.
+
+\section{The technical side of sharing}
+
+\subsection{Informal sharing with \hgcmd{serve}}
+\label{sec:collab:serve}
+
+Mercurial's \hgcmd{serve} command is wonderfully suited to small,
+tight-knit, and fast-paced group environments.  It also provides a
+great way to get a feel for using Mercurial commands over a network.
+
+Run \hgcmd{serve} inside a repository, and in under a second it will
+bring up a specialised HTTP server; this will accept connections from
+any client, and serve up data for that repository until you terminate
+it.  Anyone who knows the URL of the server you just started, and can
+talk to your computer over the network, can then use a web browser or
+Mercurial to read data from that repository.  A URL for a
+\hgcmd{serve} instance running on a laptop is likely to look something
+like \Verb|http://my-laptop.local:8000/|.
+
+The \hgcmd{serve} command is \emph{not} a general-purpose web server.
+It can do only two things:
+\begin{itemize}
+\item Allow people to browse the history of the repository it's
+  serving, from their normal web browsers.
+\item Speak Mercurial's wire protocol, so that people can
+  \hgcmd{clone} or \hgcmd{pull} changes from that repository.
+\end{itemize}
+In particular, \hgcmd{serve} won't allow remote users to \emph{modify}
+your repository.  It's intended for read-only use.
+
+If you're getting started with Mercurial, there's nothing to prevent
+you from using \hgcmd{serve} to serve up a repository on your own
+computer, then using commands like \hgcmd{clone}, \hgcmd{incoming}, and
+so on to talk to that server as if the repository were hosted remotely.
+This can help you to quickly get acquainted with using commands on
+network-hosted repositories.
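+
+One way to try this, sketched here with made-up directory names, is to
+use two terminal windows on the same machine:
+\begin{verbatim}
+# Terminal 1: serve an existing repository (runs until interrupted).
+cd my-project
+hg serve
+
+# Terminal 2: treat the local server as if it were remote.
+hg clone http://localhost:8000/ practice
+cd practice
+hg incoming
+\end{verbatim}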
+
+\subsubsection{A few things to keep in mind}
+
+Because it provides unauthenticated read access to all clients, you
+should use \hgcmd{serve} only in an environment where you either don't
+care who can access your network and pull data from your repository,
+or have complete control over who can.
+
+The \hgcmd{serve} command knows nothing about any firewall software
+you might have installed on your system or network.  It cannot detect
+or control your firewall software.  If other people are unable to talk
+to a running \hgcmd{serve} instance, the second thing you should do
+(\emph{after} you make sure that they're using the correct URL) is
+check your firewall configuration.
+
+By default, \hgcmd{serve} listens for incoming connections on
+port~8000.  If another process is already listening on the port you
+want to use, you can specify a different port to listen on using the
+\hgopt{serve}{-p} option.
+
+Normally, when \hgcmd{serve} starts, it prints no output, which can be
+a bit unnerving.  If you'd like to confirm that it is indeed running
+correctly, and find out what URL you should send to your
+collaborators, start it with the \hggopt{-v} option.
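+
+Combining the two options, a sketch (the port number is an arbitrary
+choice) looks like this:
+\begin{verbatim}
+# Listen on port 8080 instead of 8000, with verbose output enabled.
+hg serve -p 8080 -v
+\end{verbatim}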
+
+\subsection{Using \command{ssh} as a tunnel}
+\label{sec:collab:ssh}
+
+\subsection{Serving HTTP with a CGI script}
+\label{sec:collab:cgi}
+
+
+
+%%% Local Variables: 
+%%% mode: latex
+%%% TeX-master: "00book"
+%%% End: