\chapter{Mercurial in daily use}
\label{chap:daily}

\section{Telling Mercurial which files to track}

Mercurial does not work with files in your repository unless you tell
it to manage them.  The \hgcmd{status} command will tell you which
files Mercurial doesn't know about; it uses a ``\texttt{?}'' to
display such files.

To tell Mercurial to track a file, use the \hgcmd{add} command.  Once
you have added a file, the entry in the output of \hgcmd{status} for
that file changes from ``\texttt{?}'' to ``\texttt{A}''.
\interaction{daily.files.add}
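
As a rough sketch of the same sequence, the following shell session
creates a throwaway repository and adds one file to it.  The names
\filename{scratch} and \filename{notes.txt} are made up for
illustration.
\begin{verbatim}
# Work in a temporary directory so that nothing real is touched.
cd "$(mktemp -d)"
hg init scratch
cd scratch
echo "some text" > notes.txt
hg status          # prints "? notes.txt": the file is not tracked yet
hg add notes.txt
hg status          # prints "A notes.txt": scheduled for the next commit
\end{verbatim}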

After you run a \hgcmd{commit}, the files that you added before the
commit will no longer be listed in the output of \hgcmd{status}.  The
reason for this is that \hgcmd{status} only tells you about
``interesting'' files---those that you have modified or told Mercurial
to do something with---by default.  If you have a repository that
contains thousands of files, you will rarely want to know about files
that Mercurial is tracking, but that have not changed.  (You can still
get this information; we'll return to this later.)

Once you add a file, Mercurial doesn't do anything with it
immediately.  Instead, it will take a snapshot of the file's state the
next time you perform a commit.  It will then continue to track the
changes you make to the file every time you commit, until you remove
the file.

\subsection{Explicit versus implicit file naming}

Every Mercurial command has a useful behaviour: if you pass it the
name of a directory, it treats this as ``I want to operate on every
file in this directory and its subdirectories''.
\interaction{daily.files.add-dir}
Notice in this example that Mercurial printed the names of the files
it added, whereas it didn't do so when we added the file named
\filename{a} in the earlier example.

What's going on is that in the earlier case, we explicitly named the
file to add on the command line.  In such cases, Mercurial assumes
that you know what you are doing, and it doesn't print any output.

However, when we \emph{imply} the names of files by giving the name of
a directory, Mercurial takes the extra step of printing the name of
each file that it does something with.  This makes it clearer what
is happening, and reduces the likelihood of a silent and nasty
surprise.  This behaviour is common to most Mercurial commands.
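
The difference is easy to see in a small experiment.  This is only a
sketch: the file and directory names (\filename{top.txt},
\filename{docs}) are invented for the purpose.
\begin{verbatim}
cd "$(mktemp -d)"
hg init scratch
cd scratch
mkdir docs
echo one > docs/one.txt
echo two > docs/two.txt
echo top > top.txt
hg add top.txt     # explicit file name: no output
hg add docs        # directory name: one "adding ..." line per file
hg status          # all three files are now listed with "A"
\end{verbatim}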

\subsection{Aside: Mercurial tracks files, not directories}

Mercurial does not track directory information.  Instead, it tracks
the path to a file.  Before creating a file, it first creates any
missing directory components of the path.  After it deletes a file, it
then deletes any empty directories that were in the deleted file's
path.  This sounds like a trivial distinction, but it has one minor
practical consequence: it is not possible to represent a completely
empty directory in Mercurial.

Empty directories are rarely useful, and there are unintrusive
workarounds that you can use to achieve an appropriate effect.  The
developers of Mercurial thus felt that the complexity that would be
required to manage empty directories was not worth the limited benefit
this feature would bring.

If you need an empty directory in your repository, there are a few
ways to achieve this. One is to create a directory, then \hgcmd{add} a
``hidden'' file to that directory.  On Unix-like systems, any file
name that begins with a period (``\texttt{.}'') is treated as hidden
by most commands and GUI tools.  This approach is illustrated in
figure~\ref{ex:daily:hidden}.

\begin{figure}[ht]
  \interaction{daily.files.hidden}
  \caption{Simulating an empty directory using a hidden file}
  \label{ex:daily:hidden}
\end{figure}
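
A minimal sketch of this workaround follows.  The directory name
\filename{empty}, the file name \filename{.hidden} and the clone name
are arbitrary choices, and \texttt{HGUSER} is set only so that the
commit works on a machine with no configured username.
\begin{verbatim}
export HGUSER="Example User <user@example.com>"  # hypothetical identity
cd "$(mktemp -d)"
hg init scratch
cd scratch
mkdir empty
touch empty/.hidden
hg add empty/.hidden
hg commit -m "Keep the otherwise empty directory"
# A fresh clone recreates the directory because of the file inside it.
hg clone . ../scratch-clone
ls -a ../scratch-clone/empty
\end{verbatim}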

Another way to tackle a need for an empty directory is to simply
create one in your automated build scripts before they need it.

\section{How to stop tracking a file}

Once you decide that a file no longer belongs in your repository, use
the \hgcmd{remove} command; this deletes the file, and tells Mercurial
to stop tracking it.  A removed file is represented in the output of
\hgcmd{status} with an ``\texttt{R}''.
\interaction{daily.files.remove}
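
For readers following along at a shell prompt, a sketch of the same
steps might look like this.  The file name \filename{old.txt} is made
up, and \texttt{HGUSER} is set only so that the commit succeeds
without a configured username.
\begin{verbatim}
export HGUSER="Example User <user@example.com>"  # hypothetical identity
cd "$(mktemp -d)"
hg init scratch
cd scratch
echo data > old.txt
hg add old.txt
hg commit -m "Add old.txt"
hg remove old.txt  # deletes the file and schedules the removal
hg status          # prints "R old.txt"
\end{verbatim}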

After you \hgcmd{remove} a file, Mercurial will no longer track
changes to that file, even if you recreate a file with the same name
in your working directory.  If you do recreate a file with the same
name and want Mercurial to track the new file, simply \hgcmd{add} it.
Mercurial will know that the newly added file is not related to the
old file of the same name.

\subsection{Removing a file does not affect its history}

It is important to understand that removing a file has only two
effects.
\begin{itemize}
\item It removes the current version of the file from the working
  directory.
\item It stops Mercurial from tracking changes to the file, from the
  time of the next commit.
\end{itemize}
Removing a file \emph{does not} in any way alter the \emph{history} of
the file.

If you update the working directory to a changeset in which a file
that you have removed was still tracked, it will reappear in the
working directory, with the contents it had when you committed that
changeset.  If you then update the working directory to a later
changeset, in which the file had been removed, Mercurial will once
again remove the file from the working directory.
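
The following sketch walks through this back-and-forth in a scratch
repository.  It assumes a brand new repository, so that the two
commits get revision numbers 0 and 1; the file name is invented.
\begin{verbatim}
export HGUSER="Example User <user@example.com>"  # hypothetical identity
cd "$(mktemp -d)"
hg init scratch
cd scratch
echo data > old.txt
hg add old.txt
hg commit -m "Add old.txt"       # becomes revision 0
hg remove old.txt
hg commit -m "Remove old.txt"    # becomes revision 1
hg update -r 0     # old.txt reappears with its committed contents
ls
hg update -r 1     # old.txt is removed from the working directory again
ls
\end{verbatim}
The history still holds the old contents: \texttt{hg cat -r 0 old.txt}
prints them even while the file is absent from the working directory.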

\subsection{Missing files}

Mercurial considers a file that you have deleted, but not used
\hgcmd{remove} to delete, to be \emph{missing}.  A missing file is
represented with ``\texttt{!}'' in the output of \hgcmd{status}.
Mercurial commands will not generally do anything with missing files.
\interaction{daily.files.missing}

If your repository contains a file that \hgcmd{status} reports as
missing, and you want the file to stay gone, you can run
\hgcmdargs{remove}{\hgopt{remove}{--after}} at any time later on, to
tell Mercurial that you really did mean to remove the file.
\interaction{daily.files.remove-after}

On the other hand, if you deleted the missing file by accident, use
\hgcmdargs{revert}{\emph{filename}} to recover the file.  It will
reappear, in unmodified form.
\interaction{daily.files.recover-missing}
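
Both outcomes can be tried side by side.  In this sketch, which uses
the invented names \filename{a.txt} and \filename{b.txt}, one missing
file is confirmed as removed and the other is recovered.
\begin{verbatim}
export HGUSER="Example User <user@example.com>"  # hypothetical identity
cd "$(mktemp -d)"
hg init scratch
cd scratch
echo one > a.txt
echo two > b.txt
hg add a.txt b.txt
hg commit -m "Add two files"
rm a.txt b.txt            # deleted behind Mercurial's back
hg status                 # both files are listed with "!"
hg remove --after a.txt   # a.txt really was meant to go
hg revert b.txt           # b.txt was deleted by accident
hg status                 # prints "R a.txt"
\end{verbatim}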

\subsection{Aside: why tell Mercurial explicitly to
  remove a file?}

You might wonder why Mercurial requires you to explicitly tell it that
you are deleting a file.  Early in the development of Mercurial,
it let you delete a file however you pleased; Mercurial would notice
the absence of the file automatically when you next ran a
\hgcmd{commit}, and stop tracking the file.  In practice, this made it
too easy to accidentally remove a file without noticing.

\subsection{Useful shorthand---adding and removing files
  in one step}

Mercurial offers a combination command, \hgcmd{addremove}, that adds
untracked files and marks missing files as removed.
\interaction{daily.files.addremove}
The \hgcmd{commit} command also provides a \hgopt{commit}{-A} option
that performs this same add-and-remove, immediately followed by a
commit.
\interaction{daily.files.commit-addremove}
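
Here is a sketch of both forms in one session, with invented file
names.  Note that \hgcmd{addremove} acts on \emph{every} untracked and
missing file it finds, so it is worth checking \hgcmd{status} before
committing.
\begin{verbatim}
export HGUSER="Example User <user@example.com>"  # hypothetical identity
cd "$(mktemp -d)"
hg init scratch
cd scratch
echo one > a.txt
hg add a.txt
hg commit -m "Add a.txt"
rm a.txt                  # a.txt is now missing
echo two > b.txt          # b.txt is untracked
hg addremove              # marks a.txt as removed and b.txt as added
hg status
echo three > c.txt
hg commit -A -m "Commit everything, including the new c.txt"
hg status                 # prints nothing: all changes are committed
\end{verbatim}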

\section{Copying files}

Mercurial provides a \hgcmd{copy} command that lets you make a new
copy of a file.  When you copy a file using this command, Mercurial
makes a record of the fact that the new file is a copy of the original
file.  It treats these copied files specially when you merge your work
with someone else's.

\subsection{The results of copying during a merge}

What happens during a merge is that changes ``follow'' a copy.  To
best illustrate what this means, let's create an example.  We'll start
with the usual tiny repository that contains a single file.
\interaction{daily.copy.init}
We need to do some work in parallel, so that we'll have something to
merge.  So let's clone our repository.
\interaction{daily.copy.clone}
Back in our initial repository, let's use the \hgcmd{copy} command to
make a copy of the first file we created.
\interaction{daily.copy.copy}

If we look at the output of the \hgcmd{status} command afterwards, the
copied file looks just like a normal added file.
\interaction{daily.copy.status}
But if we pass the \hgopt{status}{-C} option to \hgcmd{status}, it
prints another line of output: this is the file that our newly-added
file was copied \emph{from}.
\interaction{daily.copy.status-copy}

Now, back in the repository we cloned, let's make a change in
parallel.  We'll add a line of content to the original file that we
created.
\interaction{daily.copy.other}
Now we have a modified \filename{file} in this repository.  When we
pull the changes from the first repository, and merge the two heads,
Mercurial will propagate the changes that we made locally to
\filename{file} into its copy, \filename{new-file}.
\interaction{daily.copy.merge}
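
The whole scenario can be reproduced with a short script.  This is a
sketch under a few assumptions: the repository names \filename{orig}
and \filename{other} are arbitrary, \texttt{HGUSER} stands in for a
configured username, and \texttt{HGMERGE} is set to Mercurial's
built-in merge so that no external merge tool is launched.
\begin{verbatim}
export HGUSER="Example User <user@example.com>"  # hypothetical identity
export HGMERGE=internal:merge    # use the built-in merge, no external tool
cd "$(mktemp -d)"
hg init orig
cd orig
echo line > file
hg add file
hg commit -m "Added a file"
cd ..
hg clone orig other

# In the original repository, record a copy.
cd orig
hg copy file new-file
hg commit -m "Copied file"
cd ..

# In the clone, change the original file in parallel.
cd other
echo "new contents" >> file
hg commit -m "Changed file"

# Pull the copy and merge: the change should also land in new-file.
hg pull ../orig
hg merge
cat new-file
\end{verbatim}
After the merge, \filename{new-file} should contain the line that was
only ever added to \filename{file}.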

\subsection{Why should changes follow copies?}
\label{sec:daily:why-copy}

This behaviour, of changes to a file propagating out to copies of the
file, might seem esoteric, but in most cases it's highly desirable.

First of all, remember that this propagation \emph{only} happens when
you merge.  So if you \hgcmd{copy} a file, and subsequently modify the
original file during the normal course of your work, nothing will
happen.

The second thing to know is that modifications will only propagate
across a copy as long as the repository that you're pulling changes
from \emph{doesn't know} about the copy.

The reason that Mercurial does this is as follows.  Let's say I make
an important bug fix in a source file, and commit my changes.
Meanwhile, you've decided to \hgcmd{copy} the file in your repository,
without knowing about the bug or having seen the fix, and you have
started hacking on your copy of the file.

If you pulled and merged my changes, and Mercurial \emph{didn't}
propagate changes across copies, the copy you made would still contain
the bug.  Unless you remembered to apply the bug fix by hand, the bug
would \emph{remain} in your copy of the file.

By automatically propagating the change that fixed the bug from the
original file to the copy, Mercurial prevents this class of problem.
To my knowledge, Mercurial is the \emph{only} revision control system
that propagates changes across copies like this.

Once your change history has a record that the copy and subsequent
merge occurred, there's usually no further need to propagate changes
from the original file to the copied file, and that's why Mercurial
only propagates changes across copies until this point, and no
further.

\subsection{How to make changes \emph{not} follow a copy}

If, for some reason, you decide that this business of automatically
propagating changes across copies is not for you, simply use your
system's normal file copy command (on Unix-like systems, that's
\command{cp}) to make a copy of a file, then \hgcmd{add} the new copy
by hand.  Before you do so, though, please do reread
section~\ref{sec:daily:why-copy}, and make an informed decision that
this behaviour is not appropriate to your specific case.
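
A sketch of this manual approach, using the invented name
\filename{plain-copy}:
\begin{verbatim}
export HGUSER="Example User <user@example.com>"  # hypothetical identity
cd "$(mktemp -d)"
hg init scratch
cd scratch
echo line > file
hg add file
hg commit -m "Added a file"
cp file plain-copy   # an ordinary copy that Mercurial knows nothing about
hg add plain-copy
hg status -C         # prints "A plain-copy" with no source line under it
\end{verbatim}
Because no copy was recorded, a later merge treats
\filename{plain-copy} as an unrelated file.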

\subsection{Behaviour of the \hgcmd{copy} command}

When you use the \hgcmd{copy} command, Mercurial makes a copy of each
source file as it currently stands in the working directory.  This
means that if you make some modifications to a file, then \hgcmd{copy}
it without first having committed those changes, the new copy will
also contain the modifications you have made up until that point.  (I
find this behaviour a little counterintuitive, which is why I mention
it here.)

The \hgcmd{copy} command acts similarly to the Unix \command{cp}
command (you can use the \hgcmd{cp} alias if you prefer).  The last
argument is the \emph{destination}, and all prior arguments are
\emph{sources}.  If you pass it a single file as the source, and the
destination does not exist, it creates a new file with that name.
\interaction{daily.copy.simple}
If the destination is a directory, Mercurial copies its sources into
that directory.
\interaction{daily.copy.dir-dest}
Copying a directory is recursive, and preserves the directory
structure of the source.
\interaction{daily.copy.dir-src}
If the source and destination are both directories, the source tree is
recreated in the destination directory.
\interaction{daily.copy.dir-src-dest}

As with the \hgcmd{rename} command, if you copy a file manually and
then want Mercurial to know that you've copied the file, simply use
the \hgopt{copy}{--after} option to \hgcmd{copy}.
\interaction{daily.copy.after}
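
In shell terms, recording a copy after the fact might look like the
following sketch (the file names \filename{a} and \filename{b} are
placeholders):
\begin{verbatim}
export HGUSER="Example User <user@example.com>"  # hypothetical identity
cd "$(mktemp -d)"
hg init scratch
cd scratch
echo line > a
hg add a
hg commit -m "Added a"
cp a b                 # the copy was made outside Mercurial
hg copy --after a b    # record it without touching either file
hg status -C           # lists b as added, with a underneath as its source
\end{verbatim}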

\section{Renaming files}

It's rather more common to need to rename a file than to make a copy
of it.  The reason I discussed the \hgcmd{copy} command before talking
about renaming files is that Mercurial treats a rename in essentially
the same way as a copy.  Therefore, knowing what Mercurial does when
you copy a file tells you what to expect when you rename a file.

When you use the \hgcmd{rename} command, Mercurial makes a copy of
each source file, then deletes the original and marks it as removed.
\interaction{daily.rename.rename}
The \hgcmd{status} command shows the newly copied file as added, and
the copied-from file as removed.
\interaction{daily.rename.status}
As with the results of a \hgcmd{copy}, we must use the
\hgopt{status}{-C} option to \hgcmd{status} to see that the added file
is really being tracked by Mercurial as a copy of the original, now
removed, file.
\interaction{daily.rename.status-copy}
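
The same steps as a self-contained sketch, again with placeholder
file names:
\begin{verbatim}
export HGUSER="Example User <user@example.com>"  # hypothetical identity
cd "$(mktemp -d)"
hg init scratch
cd scratch
echo line > a
hg add a
hg commit -m "Added a"
hg rename a b
hg status       # b is listed with "A" and a with "R"
hg status -C    # additionally shows a as the source of b
\end{verbatim}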

As with \hgcmd{remove} and \hgcmd{copy}, you can tell Mercurial about
a rename after the fact using the \hgopt{rename}{--after} option.  In
most other respects, the behaviour of the \hgcmd{rename} command, and
the options it accepts, are similar to those of the \hgcmd{copy}
command.

\subsection{Renaming files and merging changes}

Since Mercurial's rename is implemented as copy-and-remove, the same
propagation of changes happens when you merge after a rename as after
a copy.

If I modify a file, and you rename it to a new name, and then we merge
our respective changes, my modifications to the file under its
original name will be propagated into the file under its new name.
(This is something you might expect to ``simply work,'' but not all
revision control systems actually do this.)

Whereas having changes follow a copy is a feature where you can
perhaps nod and say ``yes, that might be useful,'' it should be clear
that having them follow a rename is definitely important.  Without
this facility, it would simply be too easy for changes to become
orphaned when files are renamed.

\subsection{Divergent renames and merging}

The case of diverging names occurs when two developers start with a
file---let's call it \filename{foo}---in their respective
repositories.

\interaction{rename.divergent.clone}
Anne renames the file to \filename{bar}.
\interaction{rename.divergent.rename.anne}
Meanwhile, Bob renames it to \filename{quux}.
\interaction{rename.divergent.rename.bob}

I like to think of this as a conflict because each developer has
expressed different intentions about what the file ought to be named.

What do you think should happen when they merge their work?
Mercurial's actual behaviour is that it always preserves \emph{both}
names when it merges changesets that contain divergent renames.
\interaction{rename.divergent.merge}
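
To try this at home, something along the following lines should do.
It is only a sketch: the three repository names are arbitrary, and
\texttt{HGUSER} stands in for a configured username (so both
``developers'' commit under the same made-up identity).
\begin{verbatim}
export HGUSER="Example User <user@example.com>"  # hypothetical identity
cd "$(mktemp -d)"
hg init orig
cd orig
echo foo > foo
hg add foo
hg commit -m "First commit"
cd ..
hg clone orig anne
hg clone orig bob

cd anne
hg rename foo bar
hg commit -m "Rename foo to bar"

cd ../bob
hg rename foo quux
hg commit -m "Rename foo to quux"

# Bring both renames together and merge them.
cd ../orig
hg pull -u ../anne
hg pull ../bob
hg merge
ls              # both bar and quux are present; foo is gone
\end{verbatim}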

I personally find this behaviour quite surprising, which is why I
wanted to explicitly mention it here.  I would have expected Mercurial
to prompt me with a three-way choice instead: do I want to keep only
\filename{bar}, only \filename{quux}, or both?

In practice, when you rename a source file, it is likely that you will
also modify another file (such as a makefile) that knows how to build
the source file.  Suppose Anne renames a file and edits
\filename{Makefile} to build it under its new name, while Bob does the
same, but chooses a different name for the file.  After the merge,
there will be two copies of the source file in the working directory
under different names, \emph{and} a conflict in the section of the
\filename{Makefile} that both Bob and Anne edited.

This behaviour is considered surprising by other people, too:
see~\bug{455} for details.

\subsection{Convergent renames and merging}

Another kind of rename conflict occurs when two people choose to
rename different \emph{source} files to the same \emph{destination}.
In this case, Mercurial runs its normal merge machinery, and lets you
guide it to a suitable resolution.

\subsection{Other name-related corner cases}

Mercurial has a longstanding bug in which it fails to handle a merge
where one side has a file with a given name, while another has a
directory with the same name.  This is documented as~\bug{29}.
\interaction{issue29.go}

\section{Recovering from mistakes}

Mercurial has some useful commands that will help you to recover from
some common mistakes.

The \hgcmd{revert} command lets you undo changes that you have made to
your working directory.  For example, if you \hgcmd{add} a file by
accident, just run \hgcmd{revert} with the name of the file you added.
The file won't be touched in any way, but Mercurial will no longer
have it scheduled for adding.  You can also use \hgcmd{revert} to get
rid of erroneous changes to a file.

Keep in mind that the \hgcmd{revert} command only helps with changes
that you have not yet committed.  Once you've committed a change, if
you decide it was a mistake, you can still do something about it,
though your options may be more limited.

For more information about the \hgcmd{revert} command, and details
about how to deal with changes you have already committed, see
chapter~\ref{chap:undo}.
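
As a small taste of \hgcmd{revert} on uncommitted changes, here is a
sketch that undoes an accidental \hgcmd{add} and an unwanted edit.
The file names are invented, and the \texttt{--no-backup} option is
used only to keep the example's \hgcmd{status} output free of a saved
backup file.
\begin{verbatim}
export HGUSER="Example User <user@example.com>"  # hypothetical identity
cd "$(mktemp -d)"
hg init scratch
cd scratch
echo line > kept.txt
hg add kept.txt
hg commit -m "Add kept.txt"

echo oops > oops.txt
hg add oops.txt       # added by accident
hg revert oops.txt    # no longer scheduled for adding; the file stays

echo junk >> kept.txt            # an erroneous edit
hg revert --no-backup kept.txt   # back to the committed contents
hg status             # prints "? oops.txt"
\end{verbatim}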

%%% Local Variables:
%%% mode: latex
%%% TeX-master: "00book"
%%% End:
%%% author: Bryan O'Sullivan <bos@serpentine.com>
%%% date: Mon, 16 Apr 2007 17:21:38 -0700