# HG changeset patch # User Yoshiki Yazawa # Date 1247160737 -32400 # Node ID 896ab6eaf1c6b6d6028ffdec642767f154bf28b1 # Parent 5276f40fca1c9ffc02c903192d619db4729cd8f7# Parent 5225ec140003740b7b4bf745bc901276f6dc7c48 merged diff -r 5276f40fca1c -r 896ab6eaf1c6 .hgtags --- a/.hgtags Thu Jul 09 13:32:44 2009 +0900 +++ b/.hgtags Fri Jul 10 02:32:17 2009 +0900 @@ -2,3 +2,4 @@ b0db5adf11c1e096c4b08f42befb8e7e18120ed0 ja_root 0000000000000000000000000000000000000000 japanese root ec889c068d461b4f9be324d6f2f224708b85c7cb finished draft translation +18131160f7ee3b81bf39ce2c58f762b8d671cef3 submitted diff -r 5276f40fca1c -r 896ab6eaf1c6 en/00book.xml --- a/en/00book.xml Thu Jul 09 13:32:44 2009 +0900 +++ b/en/00book.xml Fri Jul 10 02:32:17 2009 +0900 @@ -8,20 +8,21 @@ - - - - - - - - - - - - - - + + + + + + + + + + + + + + + @@ -96,6 +97,10 @@ &ch12; &ch13; + + &ch14; + + &appA; &appB; diff -r 5276f40fca1c -r 896ab6eaf1c6 en/Makefile --- a/en/Makefile Thu Jul 09 13:32:44 2009 +0900 +++ b/en/Makefile Fri Jul 10 02:32:17 2009 +0900 @@ -35,7 +35,6 @@ filenames \ hook.msglen \ hook.simple \ - hook.ws \ issue29 \ mq.guards \ mq.qinit-help \ diff -r 5276f40fca1c -r 896ab6eaf1c6 en/appA-cmdref.xml --- a/en/appA-cmdref.xml Thu Jul 09 13:32:44 2009 +0900 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,224 +0,0 @@ - - - - -Command reference - -\cmdref{add}{add files at the next commit} -\optref{add}{I}{include} -\optref{add}{X}{exclude} -\optref{add}{n}{dry-run} - -\cmdref{diff}{print changes in history or working directory} - -Show differences between revisions for the specified files or -directories, using the unified diff format. For a description of the -unified diff format, see . - -By default, this command does not print diffs for files that Mercurial -considers to contain binary data. To control this behavior, see the - and options. - - -Options -x -\loptref{diff}{nodates} - -Omit date and time information when printing diff headers. 
- -\optref{diff}{B}{ignore-blank-lines} - -Do not print changes that only insert or delete blank lines. A line -that contains only whitespace is not considered blank. - - -\optref{diff}{I}{include} - - -Include files and directories whose names match the given patterns. - - -\optref{diff}{X}{exclude} - - -Exclude files and directories whose names match the given patterns. - - -\optref{diff}{a}{text} - - -If this option is not specified, hg diff will refuse to print -diffs for files that it detects as binary. Specifying -forces hg diff to treat all files as text, and generate diffs for -all of them. - - -This option is useful for files that are mostly text but have a -few embedded NUL characters. If you use it on files that contain a -lot of binary data, its output will be incomprehensible. - - -\optref{diff}{b}{ignore-space-change} - - -Do not print a line if the only change to that line is in the amount -of white space it contains. - - -\optref{diff}{g}{git} - - -Print git-compatible diffs. XXX reference a format -description. - - -\optref{diff}{p}{show-function} - - -Display the name of the enclosing function in a hunk header, using a -simple heuristic. This functionality is enabled by default, so the - option has no effect unless you change the value of -the showfunc config item, as in the following example. - - - -\optref{diff}{r}{rev} - - -Specify one or more revisions to compare. The hg diff command -accepts up to two options to specify the revisions to -compare. - - - -Display the differences between the parent revision of the - working directory and the working directory. - - -Display the differences between the specified changeset and the - working directory. - - -Display the differences between the two specified changesets. - - - -You can specify two revisions using either two -options or revision range notation. For example, the two revision -specifications below are equivalent. 
- -hg diff -r 10 -r 20 -hg diff -r10:20 - -When you provide two revisions, Mercurial treats the order of those -revisions as significant. Thus, hg diff -r10:20 will -produce a diff that will transform files from their contents as of -revision 10 to their contents as of revision 20, while -hg diff -r20:10 means the opposite: the diff that will -transform files from their revision 20 contents to their revision 10 -contents. You cannot reverse the ordering in this way if you are -diffing against the working directory. - - -\optref{diff}{w}{ignore-all-space} - - -\cmdref{version}{print version and copyright information} - - -This command displays the version of Mercurial you are running, and -its copyright license. There are four kinds of version string that -you may see. - - -The string unknown. This version of Mercurial was - not built in a Mercurial repository, and cannot determine its own - version. - - -A short numeric string, such as 1.1. This is a - build of a revision of Mercurial that was identified by a specific - tag in the repository where it was built. (This doesn't necessarily - mean that you're running an official release; someone else could - have added that tag to any revision in the repository where they - built Mercurial.) - - -A hexadecimal string, such as 875489e31abe. This - is a build of the given revision of Mercurial. - - -A hexadecimal string followed by a date, such as - 875489e31abe+20070205. This is a build of the given - revision of Mercurial, where the build repository contained some - local changes that had not been committed. - - - - - -Tips and tricks - - -Why do the results of <command role="hg-cmd">hg diff</command> and <command role="hg-cmd">hg status</command> differ? - -When you run the hg status command, you'll see a list of files -that Mercurial will record changes for the next time you perform a -commit. If you run the hg diff command, you may notice that it -prints diffs for only a subset of the files that hg status -listed. 
There are two possible reasons for this. - - -The first is that hg status prints some kinds of modifications -that hg diff doesn't normally display. The hg diff command -normally outputs unified diffs, which don't have the ability to -represent some changes that Mercurial can track. Most notably, -traditional diffs can't represent a change in whether or not a file is -executable, but Mercurial records this information. - - -If you use the option to hg diff, it will -display git-compatible diffs that can display this -extra information. - - -The second possible reason that hg diff might be printing diffs -for a subset of the files displayed by hg status is that if you -invoke it without any arguments, hg diff prints diffs against the -first parent of the working directory. If you have run hg merge -to merge two changesets, but you haven't yet committed the results of -the merge, your working directory has two parents (use hg parents -to see them). While hg status prints modifications relative to -both parents after an uncommitted merge, hg diff still -operates relative only to the first parent. You can get it to print -diffs relative to the second parent by specifying that parent with the - option. There is no way to print diffs relative to -both parents. - - - - -Generating safe binary diffs - -If you use the option to force Mercurial to print -diffs of files that are either mostly text or contain lots of -binary data, those diffs cannot subsequently be applied by either -Mercurial's hg import command or the system's patch -command. - - -If you want to generate a diff of a binary file that is safe to use as -input for hg import, use the hg diff{--git} option when you -generate the patch. The system patch command cannot handle -binary patches at all. 
- - - - - - - diff -r 5276f40fca1c -r 896ab6eaf1c6 en/appA-svn.xml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/en/appA-svn.xml Fri Jul 10 02:32:17 2009 +0900 @@ -0,0 +1,540 @@ + + + + +Migrating to Mercurial + + A common way to test the waters with a new revision control + tool is to experiment with switching an existing project, rather + than starting a new project from scratch. + + In this appendix, we discuss how to import a project's history + into Mercurial, and what to look out for if you are used to a + different revision control system. + + + Importing history from another system + + Mercurial ships with an extension named + convert, which can import project history + from most popular revision control systems. At the time this + book was written, it could import history from the following + systems: + + + Subversion + + + CVS + + + git + + + Darcs + + + Bazaar + + + Monotone + + + GNU Arch + + + Mercurial + + + + (To see why Mercurial itself is supported as a source, see + .) + + You can enable the extension in the usual way, by editing + your ~/.hgrc file. + + [extensions] +convert = + + This will make a hg convert command + available. The command is easy to use. For instance, this + command will import the Subversion history for the Nose unit + testing framework into Mercurial. + + $ hg convert http://python-nose.googlecode.com/svn/trunk + + The convert extension operates + incrementally. In other words, after you have run hg + convert once, running it again will import any new + revisions committed after the first run began. Incremental + conversion will only work if you run hg + convert in the same Mercurial repository that you + originally used, because the convert + extension saves some private metadata in a + non-revision-controlled file named + .hg/shamap inside the target + repository. 
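The incremental behaviour described above depends on the saved state in .hg/shamap inside the target repository. A minimal sketch of a pre-flight check follows. The function name and the directory name nose-hg are hypothetical, and treating a missing shamap as "no saved state" is a simplifying assumption, not a description of how the convert extension itself decides.

```shell
#!/bin/sh
# Sketch: report whether a target repository still carries the private
# metadata that `hg convert` saved on an earlier run.
# Assumption: "full" only means no .hg/shamap was found in the target.
can_convert_incrementally() {
    target=$1
    if [ -f "$target/.hg/shamap" ]; then
        echo "incremental"
    else
        echo "full"
    fi
}

# Hypothetical usage, with nose-hg as the destination directory:
#   can_convert_incrementally nose-hg
#   hg convert http://python-nose.googlecode.com/svn/trunk nose-hg
```

Naming the destination explicitly on every run is one way to make sure later runs land in the same repository as the first one.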
+ + When you want to start making changes using Mercurial, it's + best to clone the tree in which you are doing your conversions, + and leave the original tree for future incremental conversions. + This is the safest way to let you pull and merge future commits + from the source revision control system into your newly active + Mercurial project. + + + Converting multiple branches + + The hg convert command given above + converts only the history of the trunk + branch of the Subversion repository. If we instead use the + URL http://python-nose.googlecode.com/svn, + Mercurial will automatically detect the + trunk, tags and + branches layout that Subversion projects + usually use, and it will import each as a separate Mercurial + branch. + + By default, each Subversion branch imported into Mercurial + is given a branch name. After the conversion completes, you + can get a list of the active branch names in the Mercurial + repository using hg branches -a. If you + would prefer to import the Subversion branches without names, + pass the option to + hg convert. + + Once you have converted your tree, if you want to follow + the usual Mercurial practice of working in a tree that + contains a single branch, you can clone that single branch + using hg clone -r mybranchname. + + + + Mapping user names + + Some revision control tools save only short usernames with + commits, and these can be difficult to interpret. The norm + with Mercurial is to save a committer's name and email + address, which is much more useful for talking to them after + the fact. + + If you are converting a tree from a revision control + system that uses short names, you can map those names to + longer equivalents by passing a + option to hg convert. This option accepts + a file name that should contain entries of the following + form. 
+ + arist = Aristotle <aristotle@phil.example.gr> +soc = Socrates <socrates@phil.example.gr> + + Whenever convert encounters a commit + with the username arist in the source + repository, it will use the name Aristotle + <aristotle@phil.example.gr> in the converted + Mercurial revision. If no match is found for a name, it is + used verbatim. + + + + Tidying up the tree + + Not all projects have pristine history. There may be a + directory that should never have been checked in, a file that + is too big, or a whole hierarchy that needs to be + refactored. + + The convert extension supports the idea + of a file map that can reorganize the files and + directories in a project as it imports the project's history. + This is useful not only when importing history from other + revision control systems, but also to prune or refactor a + Mercurial tree. + + To specify a file map, use the + option and supply a file name. A file map contains lines of the + following forms. + + # This is a comment. +# Empty lines are ignored. + +include path/to/file + +exclude path/to/file + +rename from/some/path to/some/other/place + + + The include directive causes a file, or + all files under a directory, to be included in the destination + repository. This also excludes all other files and directories not + explicitly included. The exclude + directive causes files or directories to be omitted, and + others not explicitly mentioned to be included. + + To move a file or directory from one location to another, + use the rename directive. If you need to + move a file or directory from a subdirectory into the root of + the repository, use . as the second + argument to the rename directive. + + + + Improving Subversion conversion performance + + You will often need several attempts before you hit the + perfect combination of user map, file map, and other + conversion parameters.
Converting a Subversion repository + over an access protocol like ssh or + http can proceed thousands of times more + slowly than Mercurial is capable of actually operating, due to + network delays. This can make tuning that perfect conversion + recipe very painful. + + The svnsync + command can greatly speed up the conversion of a Subversion + repository. It is a read-only mirroring program for + Subversion repositories. The idea is that you create a local + mirror of your Subversion tree, then convert the mirror into a + Mercurial repository. + + Suppose we want to convert the Subversion repository for + the popular Memcached project into a Mercurial tree. First, + we create a local Subversion repository. + + $ svnadmin create memcached-mirror + + Next, we set up a Subversion hook that + svnsync needs. + + $ echo '#!/bin/sh' > memcached-mirror/hooks/pre-revprop-change +$ chmod +x memcached-mirror/hooks/pre-revprop-change + + We then initialize svnsync in this + repository. + + $ svnsync init file://`pwd`/memcached-mirror \ + http://code.sixapart.com/svn/memcached + + Our next step is to begin the svnsync + mirroring process. + + $ svnsync sync file://`pwd`/memcached-mirror + + Finally, we import the history of our local Subversion + mirror into Mercurial. + + $ hg convert memcached-mirror + + We can use this process incrementally if the Subversion + repository is still in use. We run svnsync + to pull new changes into our mirror, then hg + convert to import them into our Mercurial + tree. + + There are two advantages to doing a two-stage import with + svnsync. The first is that it uses more + efficient Subversion network syncing code than hg + convert, so it transfers less data over the + network. The second is that the import from a local + Subversion tree is so fast that you can tweak your conversion + setup repeatedly without having to sit through a painfully + slow network-based conversion process each time.
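The two-stage import above can be collected into one reviewable plan. The sketch below only prints the commands and does not run them. The function name two_stage_plan is hypothetical, the initialization step is written with the svnsync init subcommand, and it assumes the mirror directory name contains no spaces.

```shell
#!/bin/sh
# Sketch: print (do not run) the commands for a two-stage import.
# $1: source Subversion URL; $2: name of the local mirror directory.
# Assumption: the mirror is created under the current directory.
two_stage_plan() {
    src_url=$1
    mirror=$2
    mirror_url="file://$(pwd)/$mirror"
    # Unquoted heredoc: shell variables expand, everything else is literal.
    cat <<EOF
svnadmin create $mirror
echo '#!/bin/sh' > $mirror/hooks/pre-revprop-change
chmod +x $mirror/hooks/pre-revprop-change
svnsync init $mirror_url $src_url
svnsync sync $mirror_url
hg convert $mirror
EOF
}

# Usage: inspect the plan first, then pipe it to sh to execute it.
#   two_stage_plan http://code.sixapart.com/svn/memcached memcached-mirror
#   two_stage_plan http://code.sixapart.com/svn/memcached memcached-mirror | sh
```

For later incremental runs only the last two commands need repeating, since the mirror and its hook already exist.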
+ + + + + Migrating from Subversion + + Subversion is currently the most popular open source + revision control system. Although there are many differences + between Mercurial and Subversion, making the transition from + Subversion to Mercurial is not particularly difficult. The two + have similar command sets and generally uniform + interfaces. + + + Philosophical differences + + The fundamental difference between Subversion and + Mercurial is of course that Subversion is centralized, while + Mercurial is distributed. Since Mercurial stores all of a + project's history on your local drive, it only needs to + perform a network access when you want to explicitly + communicate with another repository. In contrast, Subversion + stores very little information locally, and the client must + thus contact its server for many common operations. + + Subversion more or less gets away without a well-defined + notion of a branch: which portion of a server's namespace + qualifies as a branch is a matter of convention, with the + software providing no enforcement. Mercurial treats a + repository as the unit of branch management. + + + Scope of commands + + Since Subversion doesn't know what parts of its + namespace are really branches, it treats most commands as + requests to operate at and below whatever directory you are + currently visiting. For instance, if you run svn + log, you'll get the history of whatever part of + the tree you're looking at, not the tree as a whole. + + Mercurial's commands behave differently, by defaulting + to operating over an entire repository. Run hg + log and it will tell you the history of the + entire tree, no matter what part of the working directory + you're visiting at the time. If you want the history of + just a particular file or directory, simply supply it by + name, e.g. hg log src. 
+ + From my own experience, this difference in default + behaviors is probably the most likely to trip you up if you + have to switch back and forth frequently between the two + tools. + + + + Multi-user operation and safety + + With Subversion, it is normal (though slightly frowned + upon) for multiple people to collaborate in a single branch. + If Alice and Bob are working together, and Alice commits + some changes to their shared branch, Bob must update his + client's view of the branch before he can commit. Since at + this time he has no permanent record of the changes he has + made, he can corrupt or lose his modifications during and + after his update. + + Mercurial encourages a commit-then-merge model instead. + Bob commits his changes locally before pulling changes from, + or pushing them to, the server that he shares with Alice. + If Alice pushed her changes before Bob tries to push his, he + will not be able to push his changes until he pulls hers, + merges with them, and commits the result of the merge. If + he makes a mistake during the merge, he still has the option + of reverting to the commit that recorded his changes. + + It is worth emphasizing that these are the common ways + of working with these tools. Subversion supports a safer + work-in-your-own-branch model, but it is cumbersome enough + in practice to not be widely used. Mercurial can support + the less safe mode of allowing changes to be pulled in and + merged on top of uncommitted edits, but this is considered + highly unusual. + + + + Published vs local changes + + A Subversion svn commit command + immediately publishes changes to a server, where they can be + seen by everyone who has read access. + + With Mercurial, commits are always local, and must be + published via a hg push command + afterwards. + + Each approach has its advantages and disadvantages. The + Subversion model means that changes are published, and hence + reviewable and usable, immediately. 
On the other hand, this + means that a user must have commit access to a repository in + order to use the software in a normal way, and commit access + is not lightly given out by most open source + projects. + + The Mercurial approach allows anyone who can clone a + repository to commit changes without the need for someone + else's permission, and they can then publish their changes + and continue to participate however they see fit. The + distinction between committing and pushing does open up the + possibility of someone committing changes to their laptop + and walking away for a few days having forgotten to push + them, which in rare cases might leave collaborators + temporarily stuck. + + + + + Quick reference + + + Subversion commands and Mercurial equivalents + + + + Subversion + Mercurial + Notes + + + + + svn add + hg add + + + + svn blame + hg annotate + + + + svn cat + hg cat + + + + svn checkout + hg clone + + + + svn cleanup + n/a + No cleanup needed + + + svn commit + hg commit; hg + push + hg push publishes after + commit + + + svn copy + hg clone + To create a new branch + + + svn copy + hg copy + To copy files or directories + + + svn delete (svn + remove) + hg remove + + + + svn diff + hg diff + + + + svn export + hg archive + + + + svn help + hg help + + + + svn import + hg addremove; hg + commit + + + + svn info + hg parents + Shows what revision is checked out + + + svn info + hg showconfig + paths.default + Shows what URL is checked out + + + svn list + hg manifest + + + + svn log + hg log + + + + svn merge + hg merge + + + + svn mkdir + n/a + Mercurial does not track directories + + + svn move (svn + rename) + hg rename + + + + svn resolved + hg resolve -m + + + + svn revert + hg revert + + + + svn status + hg status + + + + svn update + hg pull -u + + + + +
+
+
+ + + Useful tips for newcomers + + Under some revision control systems, printing a diff for a + single committed revision can be painful. For instance, with + Subversion, to see what changed in revision 104654, you must + type svn diff -r104653:104654. Mercurial + eliminates the need to type the revision ID twice in this common + case. For a plain diff, hg export 104654. For + a log message followed by a diff, hg log -r104654 + -p. + + When you run hg status without any + arguments, it prints the status of the entire tree, with paths + relative to the root of the repository. This makes it tricky to + copy a file name from the output of hg status + into the command line. If you supply a file or directory name + to hg status, it will print paths relative to + your current location instead. So to get tree-wide status from + hg status, with paths that are relative to + your current directory and not the root of the repository, feed + the output of hg root into hg + status. You can easily do this as follows on a + Unix-like system: + + $ hg status `hg root` + +
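The hg status tip above can be wrapped in a small helper that quotes the repository root, so that a root path containing spaces is still passed as a single argument. This is a sketch: the function name status_from_here and the HG override variable are hypothetical conveniences, not part of Mercurial.

```shell
#!/bin/sh
# Sketch: tree-wide status with paths relative to the current directory.
# HG is an assumed override for the command name; it defaults to "hg".
status_from_here() {
    # Stop early if we are not inside a repository.
    root=$("${HG:-hg}" root) || return 1
    # Naming the root makes hg status print paths relative to the
    # current location instead of the root of the repository.
    "${HG:-hg}" status "$root"
}

# Usage, from any subdirectory of a working copy:
#   status_from_here
```

The paths printed this way can be pasted straight back into the command line from wherever you are working in the tree.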
+ + diff -r 5276f40fca1c -r 896ab6eaf1c6 en/appB-mq-ref.xml --- a/en/appB-mq-ref.xml Thu Jul 09 13:32:44 2009 +0900 +++ b/en/appB-mq-ref.xml Fri Jul 10 02:32:17 2009 +0900 @@ -72,6 +72,16 @@ + <command role="hg-ext-mq">qfinish</command>&emdash;move + applied patches into repository history + + The hg qfinish command converts the + specified applied patches into permanent changes by moving + them out of MQ's control so that they will be treated as + normal repository history. + + + <command role="hg-ext-mq">qfold</command>&emdash;merge (<quote>fold</quote>) several patches into one @@ -328,8 +338,8 @@ no such text, a default commit message is used that identifies the name of the patch. - If a patch contains a Mercurial patch header (XXX add - link), the information in the patch header overrides these + If a patch contains a Mercurial patch header, + the information in the patch header overrides these defaults. Options: @@ -435,21 +445,6 @@ - <command - role="hg-ext-mq">qrestore</command>&emdash;restore saved - queue state - - XXX No idea what this does. - - - - <command role="hg-ext-mq">qsave</command>&emdash;save - current queue state - - XXX Likewise. - - - <command role="hg-ext-mq">qseries</command>&emdash;print the entire patch series @@ -501,9 +496,7 @@ changesets in the backup bundle. : If a - branch has multiple heads, remove all heads. XXX This - should be renamed, and use -f to strip - revs when there are pending changes. + branch has multiple heads, remove all heads. : Do not save a backup bundle. diff -r 5276f40fca1c -r 896ab6eaf1c6 en/ch00-preface.xml --- a/en/ch00-preface.xml Thu Jul 09 13:32:44 2009 +0900 +++ b/en/ch00-preface.xml Fri Jul 10 02:32:17 2009 +0900 @@ -5,751 +5,256 @@ Preface - Why revision control? Why Mercurial? + Technical storytelling - Revision control is the process of managing multiple - versions of a piece of information.
In its simplest form, this - is something that many people do by hand: every time you modify - a file, save it under a new name that contains a number, each - one higher than the number of the preceding version. + A few years ago, when I wanted to explain why I believed + that distributed revision control is important, the field was + then so new that there was almost no published literature to + refer people to. - Manually managing multiple versions of even a single file is - an error-prone task, though, so software tools to help automate - this process have long been available. The earliest automated - revision control tools were intended to help a single user to - manage revisions of a single file. Over the past few decades, - the scope of revision control tools has expanded greatly; they - now manage multiple files, and help multiple people to work - together. The best modern revision control tools have no - problem coping with thousands of people working together on - projects that consist of hundreds of thousands of files. - - The arrival of distributed revision control is relatively - recent, and so far this new field has grown due to people's - willingness to explore ill-charted territory. - - I am writing a book about distributed revision control - because I believe that it is an important subject that deserves - a field guide. I chose to write about Mercurial because it is - the easiest tool to learn the terrain with, and yet it scales to - the demands of real, challenging environments where many other - revision control tools buckle. - - - Why use revision control? - - There are a number of reasons why you or your team might - want to use an automated revision control tool for a - project. 
+ Although at that time I spent some time working on the + internals of Mercurial itself, I switched to writing this book + because that seemed like the most effective way to help the + software to reach a wide audience, along with the idea that + revision control ought to be distributed in nature. I publish + the book online under a liberal license for the same + reason: to get the word out. - - It will track the history and evolution of - your project, so you don't have to. For every change, - you'll have a log of who made it; - why they made it; - when they made it; and - what the change - was. - When you're working with other people, - revision control software makes it easier for you to - collaborate. For example, when people more or less - simultaneously make potentially incompatible changes, the - software will help you to identify and resolve those - conflicts. - It can help you to recover from mistakes. If - you make a change that later turns out to be in error, you - can revert to an earlier version of one or more files. In - fact, a really good revision control - tool will even help you to efficiently figure out exactly - when a problem was introduced (see for details). - It will help you to work simultaneously on, - and manage the drift between, multiple versions of your - project. - - - Most of these reasons are equally - valid&emdash;at least in theory&emdash;whether you're working - on a project by yourself, or with a hundred other - people. - - A key question about the practicality of revision control - at these two different scales (lone hacker and - huge team) is how its - benefits compare to its - costs. A revision control tool that's - difficult to understand or use is going to impose a high - cost. + There's a familiar rhythm to a good software book that + closely resembles telling a story: What is this thing? Why does + it matter? How will it help me? How do I use it? 
In this + book, I try to answer those questions for distributed revision + control in general, and for Mercurial in particular. + + + + Thank you for supporting Mercurial - A five-hundred-person project is likely to collapse under - its own weight almost immediately without a revision control - tool and process. In this case, the cost of using revision - control might hardly seem worth considering, since - without it, failure is almost - guaranteed. - - On the other hand, a one-person quick hack - might seem like a poor place to use a revision control tool, - because surely the cost of using one must be close to the - overall cost of the project. Right? - - Mercurial uniquely supports both of - these scales of development. You can learn the basics in just - a few minutes, and due to its low overhead, you can apply - revision control to the smallest of projects with ease. Its - simplicity means you won't have a lot of abstruse concepts or - command sequences competing for mental space with whatever - you're really trying to do. At the same - time, Mercurial's high performance and peer-to-peer nature let - you scale painlessly to handle large projects. - - No revision control tool can rescue a poorly run project, - but a good choice of tools can make a huge difference to the - fluidity with which you can work on a project. - - - - - The many names of revision control - - Revision control is a diverse field, so much so that it is - referred to by many names and acronyms. Here are a few of the - more common variations you'll encounter: - - Revision control (RCS) - Software configuration management (SCM), or - configuration management - Source code management - Source code control, or source - control - Version control - (VCS) - Some people claim that these terms actually have different - meanings, but in practice they overlap so much that there's no - agreed or even useful way to tease them apart. 
- - + By purchasing a copy of this book, you are supporting the + continued development and freedom of Mercurial in particular, + and of open source and free software in general. O'Reilly Media + and I are donating my royalties on the sales of this book to the + Software Freedom Conservancy (http://www.softwarefreedom.org/) + which provides clerical and legal support to Mercurial and a + number of other prominent and worthy open source software + projects. - This book is a work in progress + Acknowledgments - I am releasing this book while I am still writing it, in the - hope that it will prove useful to others. I am writing under an - open license in the hope that you, my readers, will contribute - feedback and perhaps content of your own. + This book would not exist were it not for the efforts of + Matt Mackall, the author and project lead of Mercurial. He is + ably assisted by hundreds of volunteer contributors across the + world. + + My children, Cian and Ruairi, always stood ready to help me + to unwind with wonderful, madcap little-boy games. I'd also + like to thank my ex-wife, Shannon, for her support. - - - About the examples in this book + My colleagues and friends provided help and support in + innumerable ways. This list of people is necessarily very + incomplete: Stephen Hahn, Karyn Ritter, Bonnie Corwin, James + Vasile, Matt Norwood, Eben Moglen, Bradley Kuhn, Robert Walsh, + Jeremy Fitzhardinge, Rachel Chalmers. - This book takes an unusual approach to code samples. Every - example is live&emdash;each one is actually the result - of a shell script that executes the Mercurial commands you see. - Every time an image of the book is built from its sources, all - the example scripts are automatically run, and their current - results compared against their expected results. + I developed this book in the open, posting drafts of + chapters to the book web site as I completed them. Readers then + submitted feedback using a web application that I developed. 
By + the time I finished writing the book, more than 100 people had + submitted comments, an amazing number considering that the + comment system was live for only about two months towards the + end of the writing process. + + I would particularly like to recognize the following people, + who between them contributed over a third of the total number of + comments. I would like to thank them for their care and effort + in providing so much detailed feedback. - The advantage of this approach is that the examples are - always accurate; they describe exactly the - behavior of the version of Mercurial that's mentioned at the - front of the book. If I update the version of Mercurial that - I'm documenting, and the output of some command changes, the - build fails. + Martin Geisler, Damien Cassou, Alexey Bakhirkin, Till Plewe, + Dan Himes, Paul Sargent, Gokberk Hamurcu, Matthijs van der + Vleuten, Michael Chermside, John Mulligan, Jordi Fita, Jon + Parise. + + I also want to acknowledge the help of the many people who + caught errors and provided helpful suggestions throughout the + book. - There is a small disadvantage to this approach, which is - that the dates and times you'll see in examples tend to be - squashed together in a way that they wouldn't be - if the same commands were being typed by a human. Where a human - can issue no more than one command every few seconds, with any - resulting timestamps correspondingly spread out, my automated - example scripts run many commands in one second. - - As an instance of this, several consecutive commits in an - example can show up as having occurred during the same second. - You can see this occur in the bisect example in , for instance. - - So when you're reading examples, don't place too much weight - on the dates or times you see in the output of commands. But - do be confident that the behavior you're - seeing is consistent and reproducible. - + Jeremy W. 
Sherman, Brian Mearns, Vincent Furia, Iwan + Luijks, Billy Edwards, Andreas Sliwka, Paweł Sołyga, Eric + Hanchrow, Steve Nicolai, Michał Masłowski, Kevin Fitch, Johan + Holmberg, Hal Wine, Volker Simonis, Thomas P Jakobsen, Ted + Stresen-Reuter, Stephen Rasku, Raphael Das Gupta, Ned + Batchelder, Lou Keeble, Li Linxiao, Kao Cardoso Félix, Joseph + Wecker, Jon Prescot, Jon Maken, John Yeary, Jason Harris, + Geoffrey Zheng, Fredrik Jonson, Ed Davies, David Zumbrunnen, + David Mercer, David Cabana, Ben Karel, Alan Franzoni, Yousry + Abdallah, Whitney Young, Vinay Sajip, Tom Towle, Tim Ottinger, + Thomas Schraitle, Tero Saarni, Ted Mielczarek, Svetoslav + Agafonkin, Shaun Rowland, Rocco Rutte, Polo-Francois Poli, + Philip Jenvey, Petr Tesałék, Peter R. Annema, Paul Bonser, + Olivier Scherler, Olivier Fournier, Nick Parker, Nick Fabry, + Nicholas Guarracino, Mike Driscoll, Mike Coleman, Mietek Bák, + Michael Maloney, László Nagy, Kent Johnson, Julio Nobrega, Jord + Fita, Jonathan March, Jonas Nockert, Jim Tittsler, Jeduan + Cornejo Legorreta, Jan Larres, James Murphy, Henri Wiechers, + Hagen Möbius, Gábor Farkas, Fabien Engels, Evert Rol, Evan + Willms, Eduardo Felipe Castegnaro, Dennis Decker Jensen, Deniz + Dogan, David Smith, Daed Lee, Christine Slotty, Charles Merriam, + Guillaume Catto, Brian Dorsey, Bob Nystrom, Benoit Boissinot, + Avi Rosenschein, Andrew Watts, Andrew Donkin, Alexey Rodriguez, + Ahmed Chaudhary. - Trends in the field - - There has been an unmistakable trend in the development and - use of revision control tools over the past four decades, as - people have become familiar with the capabilities of their tools - and constrained by their limitations. - - The first generation began by managing single files on - individual computers. Although these tools represented a huge - advance over ad-hoc manual revision control, their locking model - and reliance on a single computer limited them to small, - tightly-knit teams. 
- - The second generation loosened these constraints by moving - to network-centered architectures, and managing entire projects - at a time. As projects grew larger, they ran into new problems. - With clients needing to talk to servers very frequently, server - scaling became an issue for large projects. An unreliable - network connection could prevent remote users from being able to - talk to the server at all. As open source projects started - making read-only access available anonymously to anyone, people - without commit privileges found that they could not use the - tools to interact with a project in a natural way, as they could - not record their changes. - - The current generation of revision control tools is - peer-to-peer in nature. All of these systems have dropped the - dependency on a single central server, and allow people to - distribute their revision control data to where it's actually - needed. Collaboration over the Internet has moved from - constrained by technology to a matter of choice and consensus. - Modern tools can operate offline indefinitely and autonomously, - with a network connection only needed when syncing changes with - another repository. - - - - A few of the advantages of distributed revision - control - - Even though distributed revision control tools have for - several years been as robust and usable as their - previous-generation counterparts, people using older tools have - not yet necessarily woken up to their advantages. There are a - number of ways in which distributed tools shine relative to - centralised ones. - - For an individual developer, distributed tools are almost - always much faster than centralised tools. This is for a simple - reason: a centralised tool needs to talk over the network for - many common operations, because most metadata is stored in a - single copy on the central server. A distributed tool stores - all of its metadata locally. 
All else being equal, talking over - the network adds overhead to a centralised tool. Don't - underestimate the value of a snappy, responsive tool: you're - going to spend a lot of time interacting with your revision - control software. + Conventions Used in This Book - Distributed tools are indifferent to the vagaries of your - server infrastructure, again because they replicate metadata to - so many locations. If you use a centralised system and your - server catches fire, you'd better hope that your backup media - are reliable, and that your last backup was recent and actually - worked. With a distributed tool, you have many backups - available on every contributor's computer. - - The reliability of your network will affect distributed - tools far less than it will centralised tools. You can't even - use a centralised tool without a network connection, except for - a few highly constrained commands. With a distributed tool, if - your network connection goes down while you're working, you may - not even notice. The only thing you won't be able to do is talk - to repositories on other computers, something that is relatively - rare compared with local operations. If you have a far-flung - team of collaborators, this may be significant. - - - Advantages for open source projects + The following typographical conventions are used in this + book: - If you take a shine to an open source project and decide - that you would like to start hacking on it, and that project - uses a distributed revision control tool, you are at once a - peer with the people who consider themselves the - core of that project. If they publish their - repositories, you can immediately copy their project history, - start making changes, and record your work, using the same - tools in the same ways as insiders. By contrast, with a - centralised tool, you must use the software in a read - only mode unless someone grants you permission to - commit changes to their central server. 
Until then, you won't - be able to record changes, and your local modifications will - be at risk of corruption any time you try to update your - client's view of the repository. - - - The forking non-problem - - It has been suggested that distributed revision control - tools pose some sort of risk to open source projects because - they make it easy to fork the development of - a project. A fork happens when there are differences in - opinion or attitude between groups of developers that cause - them to decide that they can't work together any longer. - Each side takes a more or less complete copy of the - project's source code, and goes off in its own - direction. - - Sometimes the camps in a fork decide to reconcile their - differences. With a centralised revision control system, the - technical process of reconciliation is - painful, and has to be performed largely by hand. You have - to decide whose revision history is going to - win, and graft the other team's changes into - the tree somehow. This usually loses some or all of one - side's revision history. + + + Italic - What distributed tools do with respect to forking is - they make forking the only way to - develop a project. Every single change that you make is - potentially a fork point. The great strength of this - approach is that a distributed revision control tool has to - be really good at merging forks, - because forks are absolutely fundamental: they happen all - the time. - - If every piece of work that everybody does, all the - time, is framed in terms of forking and merging, then what - the open source world refers to as a fork - becomes purely a social issue. If - anything, distributed tools lower the - likelihood of a fork: - - They eliminate the social distinction that - centralised tools impose: that between insiders (people - with commit access) and outsiders (people - without). 
- They make it easier to reconcile after a - social fork, because all that's involved from the - perspective of the revision control software is just - another merge. - - Some people resist distributed tools because they want - to retain tight control over their projects, and they - believe that centralised tools give them this control. - However, if you're of this belief, and you publish your CVS - or Subversion repositories publicly, there are plenty of - tools available that can pull out your entire project's - history (albeit slowly) and recreate it somewhere that you - don't control. So while your control in this case is - illusory, you are forgoing the ability to fluidly - collaborate with whatever people feel compelled to mirror - and fork your history. - - - - - Advantages for commercial projects - - Many commercial projects are undertaken by teams that are - scattered across the globe. Contributors who are far from a - central server will see slower command execution and perhaps - less reliability. Commercial revision control systems attempt - to ameliorate these problems with remote-site replication - add-ons that are typically expensive to buy and cantankerous - to administer. A distributed system doesn't suffer from these - problems in the first place. Better yet, you can easily set - up multiple authoritative servers, say one per site, so that - there's no redundant communication between repositories over - expensive long-haul network links. + + Indicates new terms, URLs, email addresses, filenames, + and file extensions. + + - Centralised revision control systems tend to have - relatively low scalability. It's not unusual for an expensive - centralised system to fall over under the combined load of - just a few dozen concurrent users. Once again, the typical - response tends to be an expensive and clunky replication - facility. 
Since the load on a central server&emdash;if you have - one at all&emdash;is many times lower with a distributed tool - (because all of the data is replicated everywhere), a single - cheap server can handle the needs of a much larger team, and - replication to balance load becomes a simple matter of - scripting. - - If you have an employee in the field, troubleshooting a - problem at a customer's site, they'll benefit from distributed - revision control. The tool will let them generate custom - builds, try different fixes in isolation from each other, and - search efficiently through history for the sources of bugs and - regressions in the customer's environment, all without needing - to connect to your company's network. - - - - - Why choose Mercurial? + + Constant width - Mercurial has a unique set of properties that make it a - particularly good choice as a revision control system. - - It is easy to learn and use. - It is lightweight. - It scales excellently. - It is easy to - customise. - - If you are at all familiar with revision control systems, - you should be able to get up and running with Mercurial in less - than five minutes. Even if not, it will take no more than a few - minutes longer. Mercurial's command and feature sets are - generally uniform and consistent, so you can keep track of a few - general rules instead of a host of exceptions. - - On a small project, you can start working with Mercurial in - moments. Creating new changes and branches; transferring changes - around (whether locally or over a network); and history and - status operations are all fast. Mercurial attempts to stay - nimble and largely out of your way by combining low cognitive - overhead with blazingly fast operations. - - The usefulness of Mercurial is not limited to small - projects: it is used by projects with hundreds to thousands of - contributors, each containing tens of thousands of files and - hundreds of megabytes of source code. 
- - If the core functionality of Mercurial is not enough for - you, it's easy to build on. Mercurial is well suited to - scripting tasks, and its clean internals and implementation in - Python make it easy to add features in the form of extensions. - There are a number of popular and useful extensions already - available, ranging from helping to identify bugs to improving - performance. - - - - Mercurial compared with other tools + + Used for program listings, as well as within + paragraphs to refer to program elements such as variable + or function names, databases, data types, environment + variables, statements, and keywords. + + - Before you read on, please understand that this section - necessarily reflects my own experiences, interests, and (dare I - say it) biases. I have used every one of the revision control - tools listed below, in most cases for several years at a - time. - - - - Subversion - - Subversion is a popular revision control tool, developed - to replace CVS. It has a centralised client/server - architecture. - - Subversion and Mercurial have similarly named commands for - performing the same operations, so if you're familiar with - one, it is easy to learn to use the other. Both tools are - portable to all popular operating systems. - - Prior to version 1.5, Subversion had no useful support for - merges. At the time of writing, its merge tracking capability - is new, and known to be complicated - and buggy. - - Mercurial has a substantial performance advantage over - Subversion on every revision control operation I have - benchmarked. I have measured its advantage as ranging from a - factor of two to a factor of six when compared with Subversion - 1.4.3's ra_local file store, which is the - fastest access method available. In more realistic - deployments involving a network-based store, Subversion will - be at a substantially larger disadvantage. 
Because many - Subversion commands must talk to the server and Subversion - does not have useful replication facilities, server capacity - and network bandwidth become bottlenecks for modestly large - projects. - - Additionally, Subversion incurs substantial storage - overhead to avoid network transactions for a few common - operations, such as finding modified files - (status) and displaying modifications - against the current revision (diff). As a - result, a Subversion working copy is often the same size as, - or larger than, a Mercurial repository and working directory, - even though the Mercurial repository contains a complete - history of the project. - - Subversion is widely supported by third party tools. - Mercurial currently lags considerably in this area. This gap - is closing, however, and indeed some of Mercurial's GUI tools - now outshine their Subversion equivalents. Like Mercurial, - Subversion has an excellent user manual. + + Constant width bold - Because Subversion doesn't store revision history on the - client, it is well suited to managing projects that deal with - lots of large, opaque binary files. If you check in fifty - revisions to an incompressible 10MB file, Subversion's - client-side space usage stays constant. The space used by any - distributed SCM will grow rapidly in proportion to the number - of revisions, because the differences between each revision - are large. - - In addition, it's often difficult or, more usually, - impossible to merge different versions of a binary file. - Subversion's ability to let a user lock a file, so that they - temporarily have the exclusive right to commit changes to it, - can be a significant advantage to a project where binary files - are widely used. - - Mercurial can import revision history from a Subversion - repository. It can also export revision history to a - Subversion repository. This makes it easy to test the - waters and use Mercurial and Subversion in parallel - before deciding to switch.
History conversion is incremental, - so you can perform an initial conversion, then small - additional conversions afterwards to bring in new - changes. - - - - - Git + + Shows commands or other text that should be typed + literally by the user. + + - Git is a distributed revision control tool that was - developed for managing the Linux kernel source tree. Like - Mercurial, its early design was somewhat influenced by - Monotone. - - Git has a very large command set, with version 1.5.0 - providing 139 individual commands. It has something of a - reputation for being difficult to learn. Compared to Git, - Mercurial has a strong focus on simplicity. - - In terms of performance, Git is extremely fast. In - several cases, it is faster than Mercurial, at least on Linux, - while Mercurial performs better on other operations. However, - on Windows, the performance and general level of support that - Git provides is, at the time of writing, far behind that of - Mercurial. - - While a Mercurial repository needs no maintenance, a Git - repository requires frequent manual repacks of - its metadata. Without these, performance degrades, while - space usage grows rapidly. A server that contains many Git - repositories that are not rigorously and frequently repacked - will become heavily disk-bound during backups, and there have - been instances of daily backups taking far longer than 24 - hours as a result. A freshly packed Git repository is - slightly smaller than a Mercurial repository, but an unpacked - repository is several orders of magnitude larger. - - The core of Git is written in C. Many Git commands are - implemented as shell or Perl scripts, and the quality of these - scripts varies widely. I have encountered several instances - where scripts charged along blindly in the presence of errors - that should have been fatal. + + Constant width italic - Mercurial can import revision history from a Git - repository. 
- - - - - CVS - - CVS is probably the most widely used revision control tool - in the world. Due to its age and internal untidiness, it has - been only lightly maintained for many years. - - It has a centralised client/server architecture. It does - not group related file changes into atomic commits, making it - easy for people to break the build: one person - can successfully commit part of a change and then be blocked - by the need for a merge, causing other people to see only a - portion of the work they intended to do. This also affects - how you work with project history. If you want to see all of - the modifications someone made as part of a task, you will - need to manually inspect the descriptions and timestamps of - the changes made to each file involved (if you even know what - those files were). - - CVS has a muddled notion of tags and branches that I will - not attempt to even describe. It does not support renaming of - files or directories well, making it easy to corrupt a - repository. It has almost no internal consistency checking - capabilities, so it is usually not even possible to tell - whether or how a repository is corrupt. I would not recommend - CVS for any project, existing or new. - - Mercurial can import CVS revision history. However, there - are a few caveats that apply; these are true of every other - revision control tool's CVS importer, too. Due to CVS's lack - of atomic changes and unversioned filesystem hierarchy, it is - not possible to reconstruct CVS history completely accurately; - some guesswork is involved, and renames will usually not show - up. Because a lot of advanced CVS administration has to be - done by hand and is hence error-prone, it's common for CVS - importers to run into multiple problems with corrupted - repositories (completely bogus revision timestamps and files - that have remained locked for over a decade are just two of - the less interesting problems I can recall from personal - experience). 
- - Mercurial can import revision history from a CVS - repository. - - - - - Commercial tools + + Shows text that should be replaced with user-supplied + values or by values determined by context. + + + - Perforce has a centralised client/server architecture, - with no client-side caching of any data. Unlike modern - revision control tools, Perforce requires that a user run a - command to inform the server about every file they intend to - edit. - - The performance of Perforce is quite good for small teams, - but it falls off rapidly as the number of users grows beyond a - few dozen. Modestly large Perforce installations require the - deployment of proxies to cope with the load their users - generate. - - - - - Choosing a revision control tool - - With the exception of CVS, all of the tools listed above - have unique strengths that suit them to particular styles of - work. There is no single revision control tool that is best - in all situations. - - As an example, Subversion is a good choice for working - with frequently edited binary files, due to its centralised - nature and support for file locking. - - I personally find Mercurial's properties of simplicity, - performance, and good merge support to be a compelling - combination that has served me well for several years. - + + This icon signifies a tip, suggestion, or general + note. + - - - - Switching from another tool to Mercurial - - Mercurial is bundled with an extension named convert, which can incrementally - import revision history from several other revision control - tools. By incremental, I mean that you can - convert all of a project's history to date in one go, then rerun - the conversion later to obtain new changes that happened after - the initial conversion. - - The revision control tools supported by convert are as follows: - - Subversion - CVS - Git - Darcs - - In addition, convert can - export changes from Mercurial to Subversion. 
This makes it - possible to try Subversion and Mercurial in parallel before - committing to a switchover, without risking the loss of any - work. - - The convert command - is easy to use. Simply point it at the path or URL of the - source repository, optionally give it the name of the - destination repository, and it will start working. After the - initial conversion, just run the same command again to import - new changes. + + This icon indicates a warning or caution. + - A short history of revision control - - The best known of the old-time revision control tools is - SCCS (Source Code Control System), which Marc Rochkind wrote at - Bell Labs, in the early 1970s. SCCS operated on individual - files, and required every person working on a project to have - access to a shared workspace on a single system. Only one - person could modify a file at any time; arbitration for access - to files was via locks. It was common for people to lock files, - and later forget to unlock them, preventing anyone else from - modifying those files without the help of an - administrator. - - Walter Tichy developed a free alternative to SCCS in the - early 1980s; he called his program RCS (Revision Control System). - Like SCCS, RCS required developers to work in a single shared - workspace, and to lock files to prevent multiple people from - modifying them simultaneously. + Using Code Examples - Later in the 1980s, Dick Grune used RCS as a building block - for a set of shell scripts he initially called cmt, but then - renamed to CVS (Concurrent Versions System). The big innovation - of CVS was that it let developers work simultaneously and - somewhat independently in their own personal workspaces. The - personal workspaces prevented developers from stepping on each - other's toes all the time, as was common with SCCS and RCS. Each - developer had a copy of every project file, and could modify - their copies independently. 
They had to merge their edits prior - to committing changes to the central repository. + This book is here to help you get your job done. In general, + you may use the code in this book in your programs and + documentation. You do not need to contact us for permission + unless you’re reproducing a significant portion of the code. For + example, writing a program that uses several chunks of code from + this book does not require permission. Selling or distributing a + CD-ROM of examples from O’Reilly books does require permission. + Answering a question by citing this book and quoting example + code does not require permission. Incorporating a significant + amount of example code from this book into your product’s + documentation does require permission. - Brian Berliner took Grune's original scripts and rewrote - them in C, releasing in 1989 the code that has since developed - into the modern version of CVS. CVS subsequently acquired the - ability to operate over a network connection, giving it a - client/server architecture. CVS's architecture is centralised; - only the server has a copy of the history of the project. Client - workspaces just contain copies of recent versions of the - project's files, and a little metadata to tell them where the - server is. CVS has been enormously successful; it is probably - the world's most widely used revision control system. + We appreciate, but do not require, attribution. An + attribution usually includes the title, author, publisher, and + ISBN. For example: “Book Title by Some + Author. Copyright 2008 O’Reilly Media, Inc., + 978-0-596-xxxx-x.” - In the early 1990s, Sun Microsystems developed an early - distributed revision control system, called TeamWare. A - TeamWare workspace contains a complete copy of the project's - history. TeamWare has no notion of a central repository. (CVS - relied upon RCS for its history storage; TeamWare used - SCCS.) 
+ If you feel your use of code examples falls outside fair use + or the permission given above, feel free to contact us at + permissions@oreilly.com. + - As the 1990s progressed, awareness grew of a number of - problems with CVS. It records simultaneous changes to multiple - files individually, instead of grouping them together as a - single logically atomic operation. It does not manage its file - hierarchy well; it is easy to make a mess of a repository by - renaming files and directories. Worse, its source code is - difficult to read and maintain, which made the pain - level of fixing these architectural problems - prohibitive. + + Safari® Books Online - In 2001, Jim Blandy and Karl Fogel, two developers who had - worked on CVS, started a project to replace it with a tool that - would have a better architecture and cleaner code. The result, - Subversion, does not stray from CVS's centralised client/server - model, but it adds multi-file atomic commits, better namespace - management, and a number of other features that make it a - generally better tool than CVS. Since its initial release, it - has rapidly grown in popularity. + + When you see a Safari® Books Online icon on the cover of + your favorite technology book, that means the book is + available online through the O’Reilly Network Safari + Bookshelf. + - More or less simultaneously, Graydon Hoare began working on - an ambitious distributed revision control system that he named - Monotone. While Monotone addresses many of CVS's design flaws - and has a peer-to-peer architecture, it goes beyond earlier (and - subsequent) revision control tools in a number of innovative - ways. It uses cryptographic hashes as identifiers, and has an - integral notion of trust for code from different - sources. - - Mercurial began life in 2005. While a few aspects of its - design are influenced by Monotone, Mercurial focuses on ease of - use, high performance, and scalability to very large - projects. 
- + Safari offers a solution that’s better than e-books. It’s a + virtual library that lets you easily search thousands of top + tech books, cut and paste code samples, download chapters, and + find quick answers when you need the most accurate, current + information. Try it for free at http://my.safaribooksonline.com. - Colophon&emdash;this book is Free + How to Contact Us + + Please address comments and questions concerning this book + to the publisher: + + + O’Reilly Media, Inc. + + 1005 Gravenstein Highway North + + Sebastopol, CA 95472 + + 800-998-9938 (in the United States or Canada) + + 707-829-0515 (international or local) + + 707 829-0104 (fax) + - This book is licensed under the Open Publication License, - and is produced entirely using Free Software tools. It is - typeset with DocBook XML. Illustrations are drawn and rendered with - Inkscape. + We have a web page for this book, where we list errata, + examples, and any additional information. You can access this + page at: + + + + + + Don’t forget to update the <url> attribute, + too. - The complete source code for this book is published as a - Mercurial repository, at http://hg.serpentine.com/mercurial/book. + To comment or ask technical questions about this book, send + email to: + + + bookquestions@oreilly.com + + For more information about our books, conferences, Resource + Centers, and the O’Reilly Network, see our web site at: + + + + + + + + + How did we get here? + + + Why revision control? Why Mercurial? + + Revision control is the process of managing multiple + versions of a piece of information. In its simplest form, this + is something that many people do by hand: every time you modify + a file, save it under a new name that contains a number, each + one higher than the number of the preceding version. + + Manually managing multiple versions of even a single file is + an error-prone task, though, so software tools to help automate + this process have long been available. 
The earliest automated + revision control tools were intended to help a single user to + manage revisions of a single file. Over the past few decades, + the scope of revision control tools has expanded greatly; they + now manage multiple files, and help multiple people to work + together. The best modern revision control tools have no + problem coping with thousands of people working together on + projects that consist of hundreds of thousands of files. + + The arrival of distributed revision control is relatively + recent, and so far this new field has grown due to people's + willingness to explore ill-charted territory. + + I am writing a book about distributed revision control + because I believe that it is an important subject that deserves + a field guide. I chose to write about Mercurial because it is + the easiest tool to learn the terrain with, and yet it scales to + the demands of real, challenging environments where many other + revision control tools buckle. + + + Why use revision control? + + There are a number of reasons why you or your team might + want to use an automated revision control tool for a + project. + + + It will track the history and evolution of + your project, so you don't have to. For every change, + you'll have a log of who made it; + why they made it; + when they made it; and + what the change + was. + When you're working with other people, + revision control software makes it easier for you to + collaborate. For example, when people more or less + simultaneously make potentially incompatible changes, the + software will help you to identify and resolve those + conflicts. + It can help you to recover from mistakes. If + you make a change that later turns out to be in error, you + can revert to an earlier version of one or more files. In + fact, a really good revision control + tool will even help you to efficiently figure out exactly + when a problem was introduced (see for details). 
+ It will help you to work simultaneously on, + and manage the drift between, multiple versions of your + project. + + + Most of these reasons are equally + valid&emdash;at least in theory&emdash;whether you're working + on a project by yourself, or with a hundred other + people. + + A key question about the practicality of revision control + at these two different scales (lone hacker and + huge team) is how its + benefits compare to its + costs. A revision control tool that's + difficult to understand or use is going to impose a high + cost. + + A five-hundred-person project is likely to collapse under + its own weight almost immediately without a revision control + tool and process. In this case, the cost of using revision + control might hardly seem worth considering, since + without it, failure is almost + guaranteed. + + On the other hand, a one-person quick hack + might seem like a poor place to use a revision control tool, + because surely the cost of using one must be close to the + overall cost of the project. Right? + + Mercurial uniquely supports both of + these scales of development. You can learn the basics in just + a few minutes, and due to its low overhead, you can apply + revision control to the smallest of projects with ease. Its + simplicity means you won't have a lot of abstruse concepts or + command sequences competing for mental space with whatever + you're really trying to do. At the same + time, Mercurial's high performance and peer-to-peer nature let + you scale painlessly to handle large projects. + + No revision control tool can rescue a poorly run project, + but a good choice of tools can make a huge difference to the + fluidity with which you can work on a project. + + + + + The many names of revision control + + Revision control is a diverse field, so much so that it is + referred to by many names and acronyms. 
Here are a few of the + more common variations you'll encounter: + + Revision control (RCS) + Software configuration management (SCM), or + configuration management + Source code management + Source code control, or source + control + Version control + (VCS) + Some people claim that these terms actually have different + meanings, but in practice they overlap so much that there's no + agreed or even useful way to tease them apart. + + + + + + About the examples in this book + + This book takes an unusual approach to code samples. Every + example is live&emdash;each one is actually the result + of a shell script that executes the Mercurial commands you see. + Every time an image of the book is built from its sources, all + the example scripts are automatically run, and their current + results compared against their expected results. + + The advantage of this approach is that the examples are + always accurate; they describe exactly the + behavior of the version of Mercurial that's mentioned at the + front of the book. If I update the version of Mercurial that + I'm documenting, and the output of some command changes, the + build fails. + + There is a small disadvantage to this approach, which is + that the dates and times you'll see in examples tend to be + squashed together in a way that they wouldn't be + if the same commands were being typed by a human. Where a human + can issue no more than one command every few seconds, with any + resulting timestamps correspondingly spread out, my automated + example scripts run many commands in one second. + + As an instance of this, several consecutive commits in an + example can show up as having occurred during the same second. + You can see this occur in the bisect example in , for instance. + + So when you're reading examples, don't place too much weight + on the dates or times you see in the output of commands. But + do be confident that the behavior you're + seeing is consistent and reproducible. 
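The build-time check described above can be pictured with a small sketch. This is not the book's actual build machinery, which is not shown here. The function name `run_example`, the exception `ExampleMismatch`, and the stand-in command are all hypothetical, chosen only to illustrate running a command and comparing its output against a recorded expectation.

```python
import subprocess
import sys


class ExampleMismatch(Exception):
    """Raised when an example's output differs from its recorded output."""


def run_example(command, expected_output):
    """Run one example command and compare its output with the expected text.

    `command` is an argument list; `expected_output` is the text recorded
    the last time the example was known to be correct.
    """
    result = subprocess.run(command, capture_output=True, text=True)
    actual = result.stdout
    if actual != expected_output:
        # A real build would stop here, which is the point of the approach.
        raise ExampleMismatch(
            "expected %r but got %r" % (expected_output, actual)
        )
    return actual


if __name__ == "__main__":
    # A stand-in command (an assumption); the book's examples run
    # Mercurial commands instead.
    demo = [sys.executable, "-c", "print('hello')"]
    run_example(demo, "hello\n")
    print("example output matches")
```

Failing loudly on any difference is what keeps the printed examples honest: a change in a command's output shows up as a broken build and not as a stale listing.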
+ + + + + Trends in the field + + There has been an unmistakable trend in the development and + use of revision control tools over the past four decades, as + people have become familiar with the capabilities of their tools + and constrained by their limitations. + + The first generation began by managing single files on + individual computers. Although these tools represented a huge + advance over ad-hoc manual revision control, their locking model + and reliance on a single computer limited them to small, + tightly-knit teams. + + The second generation loosened these constraints by moving + to network-centered architectures, and managing entire projects + at a time. As projects grew larger, they ran into new problems. + With clients needing to talk to servers very frequently, server + scaling became an issue for large projects. An unreliable + network connection could prevent remote users from being able to + talk to the server at all. As open source projects started + making read-only access available anonymously to anyone, people + without commit privileges found that they could not use the + tools to interact with a project in a natural way, as they could + not record their changes. + + The current generation of revision control tools is + peer-to-peer in nature. All of these systems have dropped the + dependency on a single central server, and allow people to + distribute their revision control data to where it's actually + needed. Collaboration over the Internet has moved from + constrained by technology to a matter of choice and consensus. + Modern tools can operate offline indefinitely and autonomously, + with a network connection only needed when syncing changes with + another repository. 
+ + + + A few of the advantages of distributed revision + control + + Even though distributed revision control tools have for + several years been as robust and usable as their + previous-generation counterparts, people using older tools have + not yet necessarily woken up to their advantages. There are a + number of ways in which distributed tools shine relative to + centralised ones. + + For an individual developer, distributed tools are almost + always much faster than centralised tools. This is for a simple + reason: a centralised tool needs to talk over the network for + many common operations, because most metadata is stored in a + single copy on the central server. A distributed tool stores + all of its metadata locally. All else being equal, talking over + the network adds overhead to a centralised tool. Don't + underestimate the value of a snappy, responsive tool: you're + going to spend a lot of time interacting with your revision + control software. + + Distributed tools are indifferent to the vagaries of your + server infrastructure, again because they replicate metadata to + so many locations. If you use a centralised system and your + server catches fire, you'd better hope that your backup media + are reliable, and that your last backup was recent and actually + worked. With a distributed tool, you have many backups + available on every contributor's computer. + + The reliability of your network will affect distributed + tools far less than it will centralised tools. You can't even + use a centralised tool without a network connection, except for + a few highly constrained commands. With a distributed tool, if + your network connection goes down while you're working, you may + not even notice. The only thing you won't be able to do is talk + to repositories on other computers, something that is relatively + rare compared with local operations. If you have a far-flung + team of collaborators, this may be significant. 
+ + + Advantages for open source projects + + If you take a shine to an open source project and decide + that you would like to start hacking on it, and that project + uses a distributed revision control tool, you are at once a + peer with the people who consider themselves the + core of that project. If they publish their + repositories, you can immediately copy their project history, + start making changes, and record your work, using the same + tools in the same ways as insiders. By contrast, with a + centralised tool, you must use the software in a read + only mode unless someone grants you permission to + commit changes to their central server. Until then, you won't + be able to record changes, and your local modifications will + be at risk of corruption any time you try to update your + client's view of the repository. + + + The forking non-problem + + It has been suggested that distributed revision control + tools pose some sort of risk to open source projects because + they make it easy to fork the development of + a project. A fork happens when there are differences in + opinion or attitude between groups of developers that cause + them to decide that they can't work together any longer. + Each side takes a more or less complete copy of the + project's source code, and goes off in its own + direction. + + Sometimes the camps in a fork decide to reconcile their + differences. With a centralised revision control system, the + technical process of reconciliation is + painful, and has to be performed largely by hand. You have + to decide whose revision history is going to + win, and graft the other team's changes into + the tree somehow. This usually loses some or all of one + side's revision history. + + What distributed tools do with respect to forking is + they make forking the only way to + develop a project. Every single change that you make is + potentially a fork point. 
The great strength of this + approach is that a distributed revision control tool has to + be really good at merging forks, + because forks are absolutely fundamental: they happen all + the time. + + If every piece of work that everybody does, all the + time, is framed in terms of forking and merging, then what + the open source world refers to as a fork + becomes purely a social issue. If + anything, distributed tools lower the + likelihood of a fork: + + They eliminate the social distinction that + centralised tools impose: that between insiders (people + with commit access) and outsiders (people + without). + They make it easier to reconcile after a + social fork, because all that's involved from the + perspective of the revision control software is just + another merge. + + Some people resist distributed tools because they want + to retain tight control over their projects, and they + believe that centralised tools give them this control. + However, if you're of this belief, and you publish your CVS + or Subversion repositories publicly, there are plenty of + tools available that can pull out your entire project's + history (albeit slowly) and recreate it somewhere that you + don't control. So while your control in this case is + illusory, you are forgoing the ability to fluidly + collaborate with whatever people feel compelled to mirror + and fork your history. + + + + + Advantages for commercial projects + + Many commercial projects are undertaken by teams that are + scattered across the globe. Contributors who are far from a + central server will see slower command execution and perhaps + less reliability. Commercial revision control systems attempt + to ameliorate these problems with remote-site replication + add-ons that are typically expensive to buy and cantankerous + to administer. A distributed system doesn't suffer from these + problems in the first place. 
Better yet, you can easily set + up multiple authoritative servers, say one per site, so that + there's no redundant communication between repositories over + expensive long-haul network links. + + Centralised revision control systems tend to have + relatively low scalability. It's not unusual for an expensive + centralised system to fall over under the combined load of + just a few dozen concurrent users. Once again, the typical + response tends to be an expensive and clunky replication + facility. Since the load on a central server&emdash;if you have + one at all&emdash;is many times lower with a distributed tool + (because all of the data is replicated everywhere), a single + cheap server can handle the needs of a much larger team, and + replication to balance load becomes a simple matter of + scripting. + + If you have an employee in the field, troubleshooting a + problem at a customer's site, they'll benefit from distributed + revision control. The tool will let them generate custom + builds, try different fixes in isolation from each other, and + search efficiently through history for the sources of bugs and + regressions in the customer's environment, all without needing + to connect to your company's network. + + + + + Why choose Mercurial? + + Mercurial has a unique set of properties that make it a + particularly good choice as a revision control system. + + It is easy to learn and use. + It is lightweight. + It scales excellently. + It is easy to + customise. + + If you are at all familiar with revision control systems, + you should be able to get up and running with Mercurial in less + than five minutes. Even if not, it will take no more than a few + minutes longer. Mercurial's command and feature sets are + generally uniform and consistent, so you can keep track of a few + general rules instead of a host of exceptions. + + On a small project, you can start working with Mercurial in + moments. 
Creating new changes and branches; transferring changes + around (whether locally or over a network); and history and + status operations are all fast. Mercurial attempts to stay + nimble and largely out of your way by combining low cognitive + overhead with blazingly fast operations. + + The usefulness of Mercurial is not limited to small + projects: it is used by projects with hundreds to thousands of + contributors, each containing tens of thousands of files and + hundreds of megabytes of source code. + + If the core functionality of Mercurial is not enough for + you, it's easy to build on. Mercurial is well suited to + scripting tasks, and its clean internals and implementation in + Python make it easy to add features in the form of extensions. + There are a number of popular and useful extensions already + available, ranging from helping to identify bugs to improving + performance. + + + + Mercurial compared with other tools + + Before you read on, please understand that this section + necessarily reflects my own experiences, interests, and (dare I + say it) biases. I have used every one of the revision control + tools listed below, in most cases for several years at a + time. + + + + Subversion + + Subversion is a popular revision control tool, developed + to replace CVS. It has a centralised client/server + architecture. + + Subversion and Mercurial have similarly named commands for + performing the same operations, so if you're familiar with + one, it is easy to learn to use the other. Both tools are + portable to all popular operating systems. + + Prior to version 1.5, Subversion had no useful support for + merges. At the time of writing, its merge tracking capability + is new, and known to be complicated + and buggy. + + Mercurial has a substantial performance advantage over + Subversion on every revision control operation I have + benchmarked. 
I have measured its advantage as ranging from a + factor of two to a factor of six when compared with Subversion + 1.4.3's ra_local file store, which is the + fastest access method available. In more realistic + deployments involving a network-based store, Subversion will + be at a substantially larger disadvantage. Because many + Subversion commands must talk to the server and Subversion + does not have useful replication facilities, server capacity + and network bandwidth become bottlenecks for modestly large + projects. + + Additionally, Subversion incurs substantial storage + overhead to avoid network transactions for a few common + operations, such as finding modified files + (status) and displaying modifications + against the current revision (diff). As a + result, a Subversion working copy is often the same size as, + or larger than, a Mercurial repository and working directory, + even though the Mercurial repository contains a complete + history of the project. + + Subversion is widely supported by third-party tools. + Mercurial currently lags considerably in this area. This gap + is closing, however, and indeed some of Mercurial's GUI tools + now outshine their Subversion equivalents. Like Mercurial, + Subversion has an excellent user manual. + + Because Subversion doesn't store revision history on the + client, it is well suited to managing projects that deal with + lots of large, opaque binary files. If you check in fifty + revisions to an incompressible 10MB file, Subversion's + client-side space usage stays constant. The space used by any + distributed SCM will grow rapidly in proportion to the number + of revisions, because the differences between each revision + are large. + + In addition, it's often difficult or, more usually, + impossible to merge different versions of a binary file. 
+ Subversion's ability to let a user lock a file, so that they + temporarily have the exclusive right to commit changes to it, + can be a significant advantage to a project where binary files + are widely used. + + Mercurial can import revision history from a Subversion + repository. It can also export revision history to a + Subversion repository. This makes it easy to test the + waters and use Mercurial and Subversion in parallel + before deciding to switch. History conversion is incremental, + so you can perform an initial conversion, then small + additional conversions afterwards to bring in new + changes. + + + + + Git + + Git is a distributed revision control tool that was + developed for managing the Linux kernel source tree. Like + Mercurial, its early design was somewhat influenced by + Monotone. + + Git has a very large command set, with version 1.5.0 + providing 139 individual commands. It has something of a + reputation for being difficult to learn. Compared to Git, + Mercurial has a strong focus on simplicity. + + In terms of performance, Git is extremely fast. In + several cases, it is faster than Mercurial, at least on Linux, + while Mercurial performs better on other operations. However, + on Windows, the performance and general level of support that + Git provides is, at the time of writing, far behind that of + Mercurial. + + While a Mercurial repository needs no maintenance, a Git + repository requires frequent manual repacks of + its metadata. Without these, performance degrades, while + space usage grows rapidly. A server that contains many Git + repositories that are not rigorously and frequently repacked + will become heavily disk-bound during backups, and there have + been instances of daily backups taking far longer than 24 + hours as a result. A freshly packed Git repository is + slightly smaller than a Mercurial repository, but an unpacked + repository is several orders of magnitude larger. + + The core of Git is written in C. 
Many Git commands are + implemented as shell or Perl scripts, and the quality of these + scripts varies widely. I have encountered several instances + where scripts charged along blindly in the presence of errors + that should have been fatal. + + Mercurial can import revision history from a Git + repository. + + + + + CVS + + CVS is probably the most widely used revision control tool + in the world. Due to its age and internal untidiness, it has + been only lightly maintained for many years. + + It has a centralised client/server architecture. It does + not group related file changes into atomic commits, making it + easy for people to break the build: one person + can successfully commit part of a change and then be blocked + by the need for a merge, causing other people to see only a + portion of the work they intended to do. This also affects + how you work with project history. If you want to see all of + the modifications someone made as part of a task, you will + need to manually inspect the descriptions and timestamps of + the changes made to each file involved (if you even know what + those files were). + + CVS has a muddled notion of tags and branches that I will + not attempt to even describe. It does not support renaming of + files or directories well, making it easy to corrupt a + repository. It has almost no internal consistency checking + capabilities, so it is usually not even possible to tell + whether or how a repository is corrupt. I would not recommend + CVS for any project, existing or new. + + Mercurial can import CVS revision history. However, there + are a few caveats that apply; these are true of every other + revision control tool's CVS importer, too. Due to CVS's lack + of atomic changes and unversioned filesystem hierarchy, it is + not possible to reconstruct CVS history completely accurately; + some guesswork is involved, and renames will usually not show + up. 
Because a lot of advanced CVS administration has to be + done by hand and is hence error-prone, it's common for CVS + importers to run into multiple problems with corrupted + repositories (completely bogus revision timestamps and files + that have remained locked for over a decade are just two of + the less interesting problems I can recall from personal + experience). + + Mercurial can import revision history from a CVS + repository. + + + + + Commercial tools + + Perforce has a centralised client/server architecture, + with no client-side caching of any data. Unlike modern + revision control tools, Perforce requires that a user run a + command to inform the server about every file they intend to + edit. + + The performance of Perforce is quite good for small teams, + but it falls off rapidly as the number of users grows beyond a + few dozen. Modestly large Perforce installations require the + deployment of proxies to cope with the load their users + generate. + + + + + Choosing a revision control tool + + With the exception of CVS, all of the tools listed above + have unique strengths that suit them to particular styles of + work. There is no single revision control tool that is best + in all situations. + + As an example, Subversion is a good choice for working + with frequently edited binary files, due to its centralised + nature and support for file locking. + + I personally find Mercurial's properties of simplicity, + performance, and good merge support to be a compelling + combination that has served me well for several years. + + + + + + Switching from another tool to Mercurial + + Mercurial is bundled with an extension named convert, which can incrementally + import revision history from several other revision control + tools. By incremental, I mean that you can + convert all of a project's history to date in one go, then rerun + the conversion later to obtain new changes that happened after + the initial conversion. 
+ + The revision control tools supported by convert are as follows: + + Subversion + CVS + Git + Darcs + + In addition, convert can + export changes from Mercurial to Subversion. This makes it + possible to try Subversion and Mercurial in parallel before + committing to a switchover, without risking the loss of any + work. + + The convert command + is easy to use. Simply point it at the path or URL of the + source repository, optionally give it the name of the + destination repository, and it will start working. After the + initial conversion, just run the same command again to import + new changes. + + + + A short history of revision control + + The best known of the old-time revision control tools is + SCCS (Source Code Control System), which Marc Rochkind wrote at + Bell Labs, in the early 1970s. SCCS operated on individual + files, and required every person working on a project to have + access to a shared workspace on a single system. Only one + person could modify a file at any time; arbitration for access + to files was via locks. It was common for people to lock files, + and later forget to unlock them, preventing anyone else from + modifying those files without the help of an + administrator. + + Walter Tichy developed a free alternative to SCCS in the + early 1980s; he called his program RCS (Revision Control System). + Like SCCS, RCS required developers to work in a single shared + workspace, and to lock files to prevent multiple people from + modifying them simultaneously. + + Later in the 1980s, Dick Grune used RCS as a building block + for a set of shell scripts he initially called cmt, but then + renamed to CVS (Concurrent Versions System). The big innovation + of CVS was that it let developers work simultaneously and + somewhat independently in their own personal workspaces. The + personal workspaces prevented developers from stepping on each + other's toes all the time, as was common with SCCS and RCS. 
Each + developer had a copy of every project file, and could modify + their copies independently. They had to merge their edits prior + to committing changes to the central repository. + + Brian Berliner took Grune's original scripts and rewrote + them in C, releasing in 1989 the code that has since developed + into the modern version of CVS. CVS subsequently acquired the + ability to operate over a network connection, giving it a + client/server architecture. CVS's architecture is centralised; + only the server has a copy of the history of the project. Client + workspaces just contain copies of recent versions of the + project's files, and a little metadata to tell them where the + server is. CVS has been enormously successful; it is probably + the world's most widely used revision control system. + + In the early 1990s, Sun Microsystems developed an early + distributed revision control system, called TeamWare. A + TeamWare workspace contains a complete copy of the project's + history. TeamWare has no notion of a central repository. (CVS + relied upon RCS for its history storage; TeamWare used + SCCS.) + + As the 1990s progressed, awareness grew of a number of + problems with CVS. It records simultaneous changes to multiple + files individually, instead of grouping them together as a + single logically atomic operation. It does not manage its file + hierarchy well; it is easy to make a mess of a repository by + renaming files and directories. Worse, its source code is + difficult to read and maintain, which made the pain + level of fixing these architectural problems + prohibitive. + + In 2001, Jim Blandy and Karl Fogel, two developers who had + worked on CVS, started a project to replace it with a tool that + would have a better architecture and cleaner code. 
The result, + Subversion, does not stray from CVS's centralised client/server + model, but it adds multi-file atomic commits, better namespace + management, and a number of other features that make it a + generally better tool than CVS. Since its initial release, it + has rapidly grown in popularity. + + More or less simultaneously, Graydon Hoare began working on + an ambitious distributed revision control system that he named + Monotone. While Monotone addresses many of CVS's design flaws + and has a peer-to-peer architecture, it goes beyond earlier (and + subsequent) revision control tools in a number of innovative + ways. It uses cryptographic hashes as identifiers, and has an + integral notion of trust for code from different + sources. + + Mercurial began life in 2005. While a few aspects of its + design are influenced by Monotone, Mercurial focuses on ease of + use, high performance, and scalability to very large + projects. + + + + diff -r 5276f40fca1c -r 896ab6eaf1c6 en/ch01-tour-basic.xml --- a/en/ch01-tour-basic.xml Thu Jul 09 13:32:44 2009 +0900 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,930 +0,0 @@ - - - - - A tour of Mercurial: the basics - - - Installing Mercurial on your system - - Prebuilt binary packages of Mercurial are available for - every popular operating system. These make it easy to start - using Mercurial on your computer immediately. - - - Windows - - The best version of Mercurial for Windows is - TortoiseHg, which can be found at http://bitbucket.org/tortoisehg/stable/wiki/Home. - This package has no external dependencies; it just - works. It provides both command line and graphical - user interfaces. - - - - - Mac OS X - - Lee Cantey publishes an installer of Mercurial - for Mac OS X at http://mercurial.berkwood.com. - - - - Linux - - Because each Linux distribution has its own packaging - tools, policies, and rate of development, it's difficult to - give a comprehensive set of instructions on how to install - Mercurial binaries. 
The version of Mercurial that you will - end up with can vary depending on how active the person is who - maintains the package for your distribution. - - To keep things simple, I will focus on installing - Mercurial from the command line under the most popular Linux - distributions. Most of these distributions provide graphical - package managers that will let you install Mercurial with a - single click; the package name to look for is - mercurial. - - - Ubuntu and Debian: - apt-get install mercurial - Fedora: - yum install mercurial - OpenSUSE: - zypper install mercurial - Gentoo: - emerge mercurial - - - - - Solaris - - SunFreeWare, at http://www.sunfreeware.com, - provides prebuilt packages of Mercurial. - - - - - - - Getting started - - To begin, we'll use the hg - version command to find out whether Mercurial is - actually installed properly. The actual version information - that it prints isn't so important; it's whether it prints - anything at all that we care about. - - &interaction.tour.version; - - - Built-in help - - Mercurial provides a built-in help system. This is - invaluable for those times when you find yourself stuck - trying to remember how to run a command. If you are - completely stuck, simply run hg - help; it will print a brief list of commands, - along with a description of what each does. If you ask for - help on a specific command (as below), it prints more - detailed information. - - &interaction.tour.help; - - For a more impressive level of detail (which you won't - usually need) run hg help . The option is short for - , and tells - Mercurial to print more information than it usually - would. - - - - - Working with a repository - - In Mercurial, everything happens inside a - repository. The repository for a project - contains all of the files that belong to that - project, along with a historical record of the project's - files. 
- - There's nothing particularly magical about a repository; it - is simply a directory tree in your filesystem that Mercurial - treats as special. You can rename or delete a repository any - time you like, using either the command line or your file - browser. - - - Making a local copy of a repository - - Copying a repository is just a little - bit special. While you could use a normal file copying - command to make a copy of a repository, it's best to use a - built-in command that Mercurial provides. This command is - called hg clone, because it - makes an identical copy of an existing repository. - - &interaction.tour.clone; - - One advantage of using hg - clone is that, as we can see above, it lets us clone - repositories over the network. Another is that it remembers - where we cloned from, which we'll find useful soon when we - want to fetch new changes from another repository. - - If our clone succeeded, we should now have a local - directory called hello. - This directory will contain some files. - - &interaction.tour.ls; - - These files have the same contents and history in our - repository as they do in the repository we cloned. - - Every Mercurial repository is complete, - self-contained, and independent. It contains its own private - copy of a project's files and history. As we just mentioned, - a cloned repository remembers the location of the repository - it was cloned from, but Mercurial will not communicate with - that repository, or any other, unless you tell it to. - - What this means for now is that we're free to experiment - with our repository, safe in the knowledge that it's a private - sandbox that won't affect anyone else. - - - - What's in a repository? - - When we take a more detailed look inside a repository, we - can see that it contains a directory named .hg. This is where Mercurial - keeps all of its metadata for the repository. 
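Because the `.hg` directory is what marks a directory tree as a repository, a tool can recognise a repository simply by looking for it. The sketch below illustrates that idea only. The function name `find_repo_root` is hypothetical, and walking upward through parent directories is an assumption about how such a lookup might reasonably work, not a description of Mercurial's own code.

```python
import os
import tempfile


def find_repo_root(start):
    """Return the nearest enclosing directory that contains `.hg`, or None."""
    current = os.path.abspath(start)
    while True:
        if os.path.isdir(os.path.join(current, ".hg")):
            return current
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root without a match
            return None
        current = parent


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Build a throwaway tree shaped like a repository: hello/.hg plus
        # some ordinary directories that stand in for the working directory.
        repo = os.path.join(tmp, "hello")
        os.makedirs(os.path.join(repo, ".hg"))
        nested = os.path.join(repo, "src", "lib")
        os.makedirs(nested)
        print(find_repo_root(nested) == os.path.abspath(repo))
```

Everything under the returned directory other than `.hg` itself is ordinary project content, which matches the split between repository metadata and the rest of the tree.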
- - &interaction.tour.ls-a; - - The contents of the .hg directory and its - subdirectories are private to Mercurial. Every other file and - directory in the repository is yours to do with as you - please. - - To introduce a little terminology, the .hg directory is the - real repository, and all of the files and - directories that coexist with it are said to live in the - working directory. An easy way to - remember the distinction is that the - repository contains the - history of your project, while the - working directory contains a - snapshot of your project at a particular - point in history. - - - - - A tour through history - - One of the first things we might want to do with a new, - unfamiliar repository is understand its history. The hg log command gives us a view of - the history of changes in the repository. - - &interaction.tour.log; - - By default, this command prints a brief paragraph of output - for each change to the project that was recorded. In Mercurial - terminology, we call each of these recorded events a - changeset, because it can contain a record - of changes to several files. - - The fields in a record of output from hg log are as follows. - - - changeset: This - field has the format of a number, followed by a colon, - followed by a hexadecimal (or hex) - string. These are identifiers for the - changeset. The hex string is a unique identifier: the same - hex string will always refer to the same changeset. The - number is shorter and easier to type than the hex string, - but it isn't unique: the same number in two different clones - of a repository may identify different changesets. Why - provide the number at all, then? For local - convenience. - - user: The identity of the - person who created the changeset. This is a free-form - field, but it most often contains a person's name and email - address. - date: The date and time on - which the changeset was created, and the timezone in which - it was created. 
(The date and time are local to that - timezone; they display what time and date it was for the - person who created the changeset.) - summary: The first line of - the text message that the creator of the changeset entered - to describe the changeset. - - Some changesets, such as the first in the list above, - have a tag field. A tag is another way - to identify a changeset, by giving it an easy-to-remember - name. (The tag named tip is special: it - always refers to the newest change in a repository.) - - - - The default output printed by hg log is purely a summary; it is - missing a lot of detail. - - provides - a graphical representation of the history of the hello repository, to make it a - little easier to see which direction history is - flowing in. We'll be returning to this figure - several times in this chapter and the chapter that - follows. - -
- Graphical history of the <filename - class="directory">hello</filename> repository - - - XXX add text - -
- - - Changesets, revisions, and talking to other - people - - As English is a notoriously sloppy language, and computer - science has a hallowed history of terminological confusion - (why use one term when four will do?), revision control has a - variety of words and phrases that mean the same thing. If you - are talking about Mercurial history with other people, you - will find that the word changeset is often - compressed to change or (when written) - cset, and sometimes a changeset is referred to - as a revision or a rev. - - While it doesn't matter what word you - use to refer to the concept of a changeset, the - identifier that you use to refer to - a specific changeset is of - great importance. Recall that the changeset - field in the output from hg - log identifies a changeset using both a number and - a hexadecimal string. - - The revision number is a handy - notation that is only valid in that - repository. - The hexadecimal string is the - permanent, unchanging identifier that - will always identify that exact changeset in - every copy of the - repository. - - This distinction is important. If you send - someone an email talking about revision 33, - there's a high likelihood that their revision 33 will - not be the same as yours. The reason for - this is that a revision number depends on the order in which - changes arrived in a repository, and there is no guarantee - that the same changes will happen in the same order in - different repositories. Three changes a,b,c - can easily appear in one repository as - 0,1,2, while in another as - 0,2,1. - - Mercurial uses revision numbers purely as a convenient - shorthand. If you need to discuss a changeset with someone, - or make a record of a changeset for some other reason (for - example, in a bug report), use the hexadecimal - identifier. - - - - Viewing specific revisions - - To narrow the output of hg - log down to a single revision, use the (or ) option. 
You can use - either a revision number or a hexadecimal identifier, - and you can provide as many revisions as you want. - - &interaction.tour.log-r; - - If you want to see the history of several revisions - without having to list each one, you can use range - notation; this lets you express the idea I - want all revisions between abc and - def, inclusive. - - &interaction.tour.log.range; - - Mercurial also honours the order in which you specify - revisions, so hg log -r 2:4 - prints 2, 3, and 4. while hg log -r - 4:2 prints 4, 3, and 2. - - - - More detailed information - - While the summary information printed by hg log is useful if you already know - what you're looking for, you may need to see a complete - description of the change, or a list of the files changed, if - you're trying to decide whether a changeset is the one you're - looking for. The hg log - command's (or ) option gives you - this extra detail. - - &interaction.tour.log-v; - - If you want to see both the description and - content of a change, add the (or ) option. This displays - the content of a change as a unified diff - (if you've never seen a unified diff before, see for an overview). - - &interaction.tour.log-vp; - - The option is - tremendously useful, so it's well worth remembering. - - -
- - - All about command options - - Let's take a brief break from exploring Mercurial commands - to discuss a pattern in the way that they work; you may find - this useful to keep in mind as we continue our tour. - - Mercurial has a consistent and straightforward approach to - dealing with the options that you can pass to commands. It - follows the conventions for options that are common to modern - Linux and Unix systems. - - - - Every option has a long name. For example, as - we've already seen, the hg - log command accepts a option. - - - Most options have short names, too. Instead - of , we can use - . (The reason that - some options don't have short names is that the options in - question are rarely used.) - - - Long options start with two dashes (e.g. - ), while short - options start with one (e.g. ). - - - Option naming and usage is consistent across - commands. For example, every command that lets you specify - a changeset ID or revision number accepts both and arguments. - - - If you are using short options, you can save typing by - running them together. For example, the command hg log -v -p -r 2 can be written - as hg log -vpr2. - - - - In the examples throughout this book, I use short options - instead of long. This just reflects my own preference, so don't - read anything significant into it. - - Most commands that print output of some kind will print more - output when passed a - (or ) option, and - less when passed (or - ). - - - Option naming consistency - - Almost always, Mercurial commands use consistent option - names to refer to the same concepts. For instance, if a - command deals with changesets, you'll always identify them - with or . This consistent use of - option names makes it easier to remember what options a - particular command takes. - - - - - Making and reviewing changes - - Now that we have a grasp of viewing history in Mercurial, - let's take a look at making some changes and examining - them. 
- - The first thing we'll do is isolate our experiment in a - repository of its own. We use the hg - clone command, but we don't need to clone a copy of - the remote repository. Since we already have a copy of it - locally, we can just clone that instead. This is much faster - than cloning over the network, and cloning a local repository - uses less disk space in most cases, too - The saving of space arises when source and destination - repositories are on the same filesystem, in which case - Mercurial will use hardlinks to do copy-on-write sharing of - its internal metadata. If that explanation meant nothing to - you, don't worry: everything happens transparently and - automatically, and you don't need to understand it. - . - - &interaction.tour.reclone; - - As an aside, it's often good practice to keep a - pristine copy of a remote repository around, - which you can then make temporary clones of to create sandboxes - for each task you want to work on. This lets you work on - multiple tasks in parallel, each isolated from the others until - it's complete and you're ready to integrate it back. Because - local clones are so cheap, there's almost no overhead to cloning - and destroying repositories whenever you want. - - In our my-hello - repository, we have a file hello.c that - contains the classic hello, world program. - - &interaction.tour.cat1; - - Let's edit this file so that it prints a second line of - output. - - &interaction.tour.cat2; - - Mercurial's hg status - command will tell us what Mercurial knows about the files in the - repository. - - &interaction.tour.status; - - The hg status command - prints no output for some files, but a line starting with - M for - hello.c. Unless you tell it to, hg status will not print any output - for files that have not been modified. - - The M indicates that - Mercurial has noticed that we modified - hello.c. 
We didn't need to - inform Mercurial that we were going to - modify the file before we started, or that we had modified the - file after we were done; it was able to figure this out - itself. - - It's somewhat helpful to know that we've modified - hello.c, but we might prefer to know - exactly what changes we've made to it. To - do this, we use the hg diff - command. - - &interaction.tour.diff; - - - Understanding patches - - Remember to take a look at if you don't know how to read - output above. - - - - Recording changes in a new changeset - - We can modify files, build and test our changes, and use - hg status and hg diff to review our changes, until - we're satisfied with what we've done and arrive at a natural - stopping point where we want to record our work in a new - changeset. - - The hg commit command lets - us create a new changeset; we'll usually refer to this as - making a commit or - committing. - - - Setting up a username - - When you try to run hg - commit for the first time, it is not guaranteed to - succeed. Mercurial records your name and address with each - change that you commit, so that you and others will later be - able to tell who made each change. Mercurial tries to - automatically figure out a sensible username to commit the - change with. It will attempt each of the following methods, - in order: - - If you specify a option to the hg commit command on the command - line, followed by a username, this is always given the - highest precedence. - If you have set the HGUSER - environment variable, this is checked - next. - If you create a file in your home - directory called .hgrc, with a username entry, that will be - used next. To see what the contents of this file should - look like, refer to - below. - If you have set the EMAIL - environment variable, this will be used - next. - Mercurial will query your system to find out - your local user name and host name, and construct a - username from these components. 
Since this often results - in a username that is not very useful, it will print a - warning if it has to do - this. - - If all of these mechanisms fail, Mercurial will - fail, printing an error message. In this case, it will not - let you commit until you set up a - username. - You should think of the HGUSER environment - variable and the - option to the hg commit - command as ways to override Mercurial's - default selection of username. For normal use, the simplest - and most robust way to set a username for yourself is by - creating a .hgrc file; see - below for details. - - Creating a Mercurial configuration file - - To set a user name, use your favorite editor - to create a file called .hgrc in your home directory. - Mercurial will use this file to look up your personalised - configuration settings. The initial contents of your - .hgrc should look like - this. - - Figure out what the appropriate directory is on - Windows. - - # This is a Mercurial configuration file. -[ui] -username = Firstname Lastname <email.address@domain.net> - - The [ui] line begins a - section of the config file, so you can - read the username = ... - line as meaning set the value of the - username item in the - ui section. A section continues - until a new section begins, or the end of the file. - Mercurial ignores empty lines and treats any text from - # to the end of a line as - a comment. - - - - Choosing a user name - - You can use any text you like as the value of - the username config item, since this - information is for reading by other people, but will not be - interpreted by Mercurial. The convention that most - people follow is to use their name and email address, as - in the example above. - - Mercurial's built-in web server obfuscates - email addresses, to make it more difficult for the email - harvesting tools that spammers use. This reduces the - likelihood that you'll start receiving more junk email - if you publish a Mercurial repository on the - web. 
- - - - - Writing a commit message - - When we commit a change, Mercurial drops us into - a text editor, to enter a message that will describe the - modifications we've made in this changeset. This is called - the commit message. It will be a - record for readers of what we did and why, and it will be - printed by hg log after - we've finished committing. - - &interaction.tour.commit; - - The editor that the hg - commit command drops us into will contain an - empty line or two, followed by a number of lines starting with - HG:. - - -This is where I type my commit comment. - -HG: Enter commit message. Lines beginning with 'HG:' are removed. -HG: -- -HG: user: Bryan O'Sullivan <bos@serpentine.com> -HG: branch 'default' -HG: changed hello.c - - Mercurial ignores the lines that start with - HG:; it uses them only to - tell us which files it's recording changes to. Modifying or - deleting these lines has no effect. - - - Writing a good commit message - - Since hg log - only prints the first line of a commit message by default, - it's best to write a commit message whose first line stands - alone. Here's a real example of a commit message that - doesn't follow this guideline, and - hence has a summary that is not - readable. - - -changeset: 73:584af0e231be -user: Censored Person <censored.person@example.org> -date: Tue Sep 26 21:37:07 2006 -0700 -summary: include buildmeister/commondefs. Add exports. - - As far as the remainder of the contents of the - commit message are concerned, there are no hard-and-fast - rules. Mercurial itself doesn't interpret or care about the - contents of the commit message, though your project may have - policies that dictate a certain kind of - formatting. - My personal preference is for short, but - informative, commit messages that tell me something that I - can't figure out with a quick glance at the output of - hg log - --patch. 
- - - Aborting a commit - - If you decide that you don't want to commit - while in the middle of editing a commit message, simply exit - from your editor without saving the file that it's editing. - This will cause nothing to happen to either the repository - or the working directory. - If we run the hg - commit command without any arguments, it records - all of the changes we've made, as reported by hg status and hg diff. - - - Admiring our new handiwork - - Once we've finished the commit, we can use the - hg tip command to display - the changeset we just created. This command produces output - that is identical to hg - log, but it only displays the newest revision in - the repository. - - &interaction.tour.tip; - - We refer to the newest revision in the - repository as the tip revision, or simply - the tip. - - By the way, the hg tip - command accepts many of the same options as hg log, so above indicates be - verbose, - specifies print a patch. The use of to print patches is another - example of the consistent naming we mentioned earlier. - - - - - Sharing changes - - We mentioned earlier that repositories in - Mercurial are self-contained. This means that the changeset - we just created exists only in our my-hello repository. Let's - look at a few ways that we can propagate this change into - other repositories. - - - Pulling changes from another repository - To get started, let's clone our original - hello repository, - which does not contain the change we just committed. We'll - call our temporary repository hello-pull. - - &interaction.tour.clone-pull; - - We'll use the hg - pull command to bring changes from my-hello into hello-pull. However, blindly - pulling unknown changes into a repository is a somewhat - scary prospect. Mercurial provides the hg incoming command to tell us - what changes the hg pull - command would pull into the repository, - without actually pulling the changes in. 
- - &interaction.tour.incoming; - - Suppose you're pulling changes from a repository - on the network somewhere. While you are looking at the hg incoming output, and before you - pull those changes, someone might have committed something in - the remote repository. This means that it's possible to pull - more changes than you saw when using hg incoming. - - Bringing changes into a repository is a simple - matter of running the hg - pull command, and telling it which repository to - pull from. - - &interaction.tour.pull; - - As you can see - from the before-and-after output of hg tip, we have successfully - pulled changes into our repository. There remains one step - before we can see these changes in the working - directory. - - - Updating the working directory - - We have so far glossed over the relationship - between a repository and its working directory. The hg pull command that we ran in - brought changes into the - repository, but if we check, there's no sign of those changes - in the working directory. This is because hg pull does not (by default) touch - the working directory. Instead, we use the hg update command to do this. - - &interaction.tour.update; - - It might seem a bit strange that hg - pull doesn't update the working directory - automatically. There's actually a good reason for this: you - can use hg update to update - the working directory to the state it was in at any - revision in the history of the repository. If - you had the working directory updated to an old revision&emdash;to - hunt down the origin of a bug, say&emdash;and ran a hg pull which automatically updated - the working directory to a new revision, you might not be - terribly happy. - However, since pull-then-update is such a common thing to - do, Mercurial lets you combine the two by passing the option to hg pull. 
- - If you look back at the output of hg pull in when we ran it without , you can see that it printed - a helpful reminder that we'd have to take an explicit step to - update the working directory: - - - - To find out what revision the working directory is at, use - the hg parents - command. - - &interaction.tour.parents; - - If you look back at , - you'll see arrows connecting each changeset. The node that - the arrow leads from in each case is a - parent, and the node that the arrow leads - to is its child. The working directory - has a parent in just the same way; this is the changeset that - the working directory currently contains. - - To update the working directory to a particular revision, - - give a revision number or changeset ID to the hg update command. - - &interaction.tour.older; - - If you omit an explicit revision, hg update will update to the tip - revision, as shown by the second call to hg update in the example - above. - - - - Pushing changes to another repository - - Mercurial lets us push changes to another - repository, from the repository we're currently visiting. - As with the example of hg - pull above, we'll create a temporary repository - to push our changes into. - - &interaction.tour.clone-push; - - The hg outgoing command - tells us what changes would be pushed into another - repository. - - &interaction.tour.outgoing; - - And the - hg push command does the - actual push. - - &interaction.tour.push; - - As with hg - pull, the hg push - command does not update the working directory in the - repository that it's pushing changes into. Unlike hg pull, hg - push does not provide a -u - option that updates the other repository's working directory. - This asymmetry is deliberate: the repository we're pushing to - might be on a remote server and shared between several people. - If we were to update its working directory while someone was - working in it, their work would be disrupted. 
- - What happens if we try to pull or push changes - and the receiving repository already has those changes? - Nothing too exciting. - - &interaction.tour.push.nothing; - - - Sharing changes over a network - - The commands we have covered in the previous few - sections are not limited to working with local repositories. - Each works in exactly the same fashion over a network - connection; simply pass in a URL instead of a local - path. - - &interaction.tour.outgoing.net; - - In this example, we - can see what changes we could push to the remote repository, - but the repository is understandably not set up to let - anonymous users push to it. - - &interaction.tour.push.net; - - -
- - diff -r 5276f40fca1c -r 896ab6eaf1c6 en/ch02-tour-basic.xml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/en/ch02-tour-basic.xml Fri Jul 10 02:32:17 2009 +0900 @@ -0,0 +1,1035 @@ + + + + + A tour of Mercurial: the basics + + + Installing Mercurial on your system + + Prebuilt binary packages of Mercurial are available for + every popular operating system. These make it easy to start + using Mercurial on your computer immediately. + + + Windows + + The best version of Mercurial for Windows is + TortoiseHg, which can be found at http://bitbucket.org/tortoisehg/stable/wiki/Home. + This package has no external dependencies; it just + works. It provides both command line and graphical + user interfaces. + + + + + Mac OS X + + Lee Cantey publishes an installer of Mercurial + for Mac OS X at http://mercurial.berkwood.com. + + + + Linux + + Because each Linux distribution has its own packaging + tools, policies, and rate of development, it's difficult to + give a comprehensive set of instructions on how to install + Mercurial binaries. The version of Mercurial that you will + end up with can vary depending on how active the person is who + maintains the package for your distribution. + + To keep things simple, I will focus on installing + Mercurial from the command line under the most popular Linux + distributions. Most of these distributions provide graphical + package managers that will let you install Mercurial with a + single click; the package name to look for is + mercurial. + + + Ubuntu and Debian: + apt-get install mercurial + Fedora: + yum install mercurial + OpenSUSE: + zypper install mercurial + Gentoo: + emerge mercurial + + + + + Solaris + + SunFreeWare, at http://www.sunfreeware.com, + provides prebuilt packages of Mercurial. + + + + + + + Getting started + + To begin, we'll use the hg + version command to find out whether Mercurial is + installed properly. 
The actual version information that it + prints isn't so important; we simply care whether the command + runs and prints anything at all. + + &interaction.tour.version; + + + Built-in help + + Mercurial provides a built-in help system. This is + invaluable for those times when you find yourself stuck + trying to remember how to run a command. If you are + completely stuck, simply run hg + help; it will print a brief list of commands, + along with a description of what each does. If you ask for + help on a specific command (as below), it prints more + detailed information. + + &interaction.tour.help; + + For a more impressive level of detail (which you won't + usually need) run hg help . The option is short for + , and tells + Mercurial to print more information than it usually + would. + + + + + Working with a repository + + In Mercurial, everything happens inside a + repository. The repository for a project + contains all of the files that belong to that + project, along with a historical record of the project's + files. + + There's nothing particularly magical about a repository; it + is simply a directory tree in your filesystem that Mercurial + treats as special. You can rename or delete a repository any + time you like, using either the command line or your file + browser. + + + Making a local copy of a repository + + Copying a repository is just a little + bit special. While you could use a normal file copying + command to make a copy of a repository, it's best to use a + built-in command that Mercurial provides. This command is + called hg clone, because it + makes an identical copy of an existing repository. + + &interaction.tour.clone; + + One advantage of using hg + clone is that, as we can see above, it lets us clone + repositories over the network. Another is that it remembers + where we cloned from, which we'll find useful soon when we + want to fetch new changes from another repository. 
+ + If our clone succeeded, we should now have a local + directory called hello. + This directory will contain some files. + + &interaction.tour.ls; + + These files have the same contents and history in our + repository as they do in the repository we cloned. + + Every Mercurial repository is complete, + self-contained, and independent. It contains its own private + copy of a project's files and history. As we just mentioned, + a cloned repository remembers the location of the repository + it was cloned from, but Mercurial will not communicate with + that repository, or any other, unless you tell it to. + + What this means for now is that we're free to experiment + with our repository, safe in the knowledge that it's a private + sandbox that won't affect anyone else. + + + + What's in a repository? + + When we take a more detailed look inside a repository, we + can see that it contains a directory named .hg. This is where Mercurial + keeps all of its metadata for the repository. + + &interaction.tour.ls-a; + + The contents of the .hg directory and its + subdirectories are private to Mercurial. Every other file and + directory in the repository is yours to do with as you + please. + + To introduce a little terminology, the .hg directory is the + real repository, and all of the files and + directories that coexist with it are said to live in the + working directory. An easy way to + remember the distinction is that the + repository contains the + history of your project, while the + working directory contains a + snapshot of your project at a particular + point in history. + + + + + A tour through history + + One of the first things we might want to do with a new, + unfamiliar repository is understand its history. The hg log command gives us a view of + the history of changes in the repository. + + &interaction.tour.log; + + By default, this command prints a brief paragraph of output + for each change to the project that was recorded. 
In Mercurial + terminology, we call each of these recorded events a + changeset, because it can contain a record + of changes to several files. + + The fields in a record of output from hg log are as follows. + + + changeset: This + field has the format of a number, followed by a colon, + followed by a hexadecimal (or hex) + string. These are identifiers for the + changeset. The hex string is a unique identifier: the same + hex string will always refer to the same changeset in every + copy of this repository. The + number is shorter and easier to type than the hex string, + but it isn't unique: the same number in two different clones + of a repository may identify different changesets. + + user: The identity of the + person who created the changeset. This is a free-form + field, but it most often contains a person's name and email + address. + date: The date and time on + which the changeset was created, and the timezone in which + it was created. (The date and time are local to that + timezone; they display what time and date it was for the + person who created the changeset.) + summary: The first line of + the text message that the creator of the changeset entered + to describe the changeset. + + Some changesets, such as the first in the list above, + have a tag field. A tag is another way + to identify a changeset, by giving it an easy-to-remember + name. (The tag named tip is special: it + always refers to the newest change in a repository.) + + + + The default output printed by hg log is purely a summary; it is + missing a lot of detail. + + provides + a graphical representation of the history of the hello repository, to make it a + little easier to see which direction history is + flowing in. We'll be returning to this figure + several times in this chapter and the chapter that + follows. + +
+ Graphical history of the <filename + class="directory">hello</filename> repository + + + XXX add text + +
+ + + Changesets, revisions, and talking to other + people + + As English is a notoriously sloppy language, and computer + science has a hallowed history of terminological confusion + (why use one term when four will do?), revision control has a + variety of words and phrases that mean the same thing. If you + are talking about Mercurial history with other people, you + will find that the word changeset is often + compressed to change or (when written) + cset, and sometimes a changeset is referred to + as a revision or a rev. + + While it doesn't matter what word you + use to refer to the concept of a changeset, the + identifier that you use to refer to + a specific changeset is of + great importance. Recall that the changeset + field in the output from hg + log identifies a changeset using both a number and + a hexadecimal string. + + The revision number is a handy + notation that is only valid in that + repository. + The hexadecimal string is the + permanent, unchanging identifier that + will always identify that exact changeset in + every copy of the + repository. + + This distinction is important. If you send + someone an email talking about revision 33, + there's a high likelihood that their revision 33 will + not be the same as yours. The reason for + this is that a revision number depends on the order in which + changes arrived in a repository, and there is no guarantee + that the same changes will happen in the same order in + different repositories. Three changes a,b,c + can easily appear in one repository as + 0,1,2, while in another as + 0,2,1. + + Mercurial uses revision numbers purely as a convenient + shorthand. If you need to discuss a changeset with someone, + or make a record of a changeset for some other reason (for + example, in a bug report), use the hexadecimal + identifier. + + + + Viewing specific revisions + + To narrow the output of hg + log down to a single revision, use the (or ) option. 
You can use + either a revision number or a hexadecimal identifier, + and you can provide as many revisions as you want. + + &interaction.tour.log-r; + + If you want to see the history of several revisions + without having to list each one, you can use range + notation; this lets you express the idea I + want all revisions between abc and + def, inclusive. + + &interaction.tour.log.range; + + Mercurial also honours the order in which you specify + revisions, so hg log -r 2:4 + prints 2, 3, and 4, while hg log -r + 4:2 prints 4, 3, and 2. + + + + More detailed information + + While the summary information printed by hg log is useful if you already know + what you're looking for, you may need to see a complete + description of the change, or a list of the files changed, if + you're trying to decide whether a changeset is the one you're + looking for. The hg log + command's (or ) option gives you + this extra detail. + + &interaction.tour.log-v; + + If you want to see both the description and + content of a change, add the (or ) option. This displays + the content of a change as a unified diff + (if you've never seen a unified diff before, see for an overview). + + &interaction.tour.log-vp; + + The option is + tremendously useful, so it's well worth remembering. + + +
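The ordering rule for range notation can be modelled in a few lines of Python. This is only a sketch of the behaviour described above, not Mercurial's own code: the function name expand_range is made up for this illustration, and it assumes both ends of the range are plain revision numbers (no hexadecimal identifiers, tags, or open-ended ranges).

```python
def expand_range(spec):
    """Expand 'a:b' into the revision numbers it names, in the order given.

    Hypothetical helper for illustration only; it is not part of Mercurial.
    """
    start, end = (int(part) for part in spec.split(":"))
    # Count upwards for 2:4 and downwards for 4:2; both ends are inclusive.
    step = 1 if start <= end else -1
    return list(range(start, end + step, step))


print(expand_range("2:4"))  # [2, 3, 4]
print(expand_range("4:2"))  # [4, 3, 2]
```

The first call counts upwards and the second counts downwards, which matches the order in which hg log -r 2:4 and hg log -r 4:2 print their revisions.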
+ + + All about command options + + Let's take a brief break from exploring Mercurial commands + to discuss a pattern in the way that they work; you may find + this useful to keep in mind as we continue our tour. + + Mercurial has a consistent and straightforward approach to + dealing with the options that you can pass to commands. It + follows the conventions for options that are common to modern + Linux and Unix systems. + + + + Every option has a long name. For example, as + we've already seen, the hg + log command accepts a option. + + + Most options have short names, too. Instead + of , we can use + . (The reason that + some options don't have short names is that the options in + question are rarely used.) + + + Long options start with two dashes (e.g. + ), while short + options start with one (e.g. ). + + + Option naming and usage is consistent across + commands. For example, every command that lets you specify + a changeset ID or revision number accepts both and arguments. + + + If you are using short options, you can save typing by + running them together. For example, the command hg log -v -p -r 2 can be written + as hg log -vpr2. + + + + In the examples throughout this book, I usually + use short options instead of long. This simply reflects my own + preference, so don't read anything significant into it. + + Most commands that print output of some kind will print more + output when passed a + (or ) option, and + less when passed (or + ). + + + Option naming consistency + + Almost always, Mercurial commands use consistent option + names to refer to the same concepts. For instance, if a + command deals with changesets, you'll always identify them + with or . This consistent use of + option names makes it easier to remember what options a + particular command takes. + + + + + Making and reviewing changes + + Now that we have a grasp of viewing history in Mercurial, + let's take a look at making some changes and examining + them. 
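Before we start making changes, here is a quick way to try out the short-option rule from the previous section. Mercurial's own option parser is not shown here. The sketch below uses the getopt module from Python's standard library, which follows the same Unix conventions, and the option table (SHORT and LONG) is written by hand for this illustration, covering just three of the options that hg log accepts.

```python
import getopt

# Hand-written option table for this example only (an assumption, not
# Mercurial's real table). A trailing ':' or '=' means the option takes a value.
SHORT = "vpr:"
LONG = ["verbose", "patch", "rev="]


def parse(argv):
    """Return the (option, value) pairs that getopt recognises in argv."""
    opts, _remaining = getopt.getopt(argv, SHORT, LONG)
    return opts


separate = parse(["-v", "-p", "-r", "2"])
combined = parse(["-vpr2"])
long_form = parse(["--verbose", "--patch", "--rev", "2"])

print(combined)              # [('-v', ''), ('-p', ''), ('-r', '2')]
print(separate == combined)  # True
print(long_form)             # [('--verbose', ''), ('--patch', ''), ('--rev', '2')]
```

Running the short options together gives exactly the same parsed result as writing them out one by one, which is why hg log -vpr2 and hg log -v -p -r 2 behave identically.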
+ + The first thing we'll do is isolate our experiment in a + repository of its own. We use the hg + clone command, but we don't need to clone a copy of + the remote repository. Since we already have a copy of it + locally, we can just clone that instead. This is much faster + than cloning over the network, and cloning a local repository + uses less disk space in most cases, too + The saving of space arises when source and destination + repositories are on the same filesystem, in which case + Mercurial will use hardlinks to do copy-on-write sharing of + its internal metadata. If that explanation meant nothing to + you, don't worry: everything happens transparently and + automatically, and you don't need to understand it. + . + + &interaction.tour.reclone; + + As an aside, it's often good practice to keep a + pristine copy of a remote repository around, + which you can then make temporary clones of to create sandboxes + for each task you want to work on. This lets you work on + multiple tasks in parallel, each isolated from the others until + it's complete and you're ready to integrate it back. Because + local clones are so cheap, there's almost no overhead to cloning + and destroying repositories whenever you want. + + In our my-hello + repository, we have a file hello.c that + contains the classic hello, world program. + + &interaction.tour.cat1; + + Let's edit this file so that it prints a second line of + output. + + &interaction.tour.cat2; + + Mercurial's hg status + command will tell us what Mercurial knows about the files in the + repository. + + &interaction.tour.status; + + The hg status command + prints no output for some files, but a line starting with + M for + hello.c. Unless you tell it to, hg status will not print any output + for files that have not been modified. + + The M indicates that + Mercurial has noticed that we modified + hello.c. 
We didn't need to + inform Mercurial that we were going to + modify the file before we started, or that we had modified the + file after we were done; it was able to figure this out + itself. + + It's somewhat helpful to know that we've modified + hello.c, but we might prefer to know + exactly what changes we've made to it. To + do this, we use the hg diff + command. + + &interaction.tour.diff; + + + Understanding patches + + Remember to take a look at if you don't know how to read + output above. + + + + Recording changes in a new changeset + + We can modify files, build and test our changes, and use + hg status and hg diff to review our changes, until + we're satisfied with what we've done and arrive at a natural + stopping point where we want to record our work in a new + changeset. + + The hg commit command lets + us create a new changeset; we'll usually refer to this as + making a commit or + committing. + + + Setting up a username + + When you try to run hg + commit for the first time, it is not guaranteed to + succeed. Mercurial records your name and address with each + change that you commit, so that you and others will later be + able to tell who made each change. Mercurial tries to + automatically figure out a sensible username to commit the + change with. It will attempt each of the following methods, + in order: + + If you specify a option to the hg commit command on the command + line, followed by a username, this is always given the + highest precedence. + If you have set the HGUSER + environment variable, this is checked + next. + If you create a file in your home + directory called .hgrc, with a username entry, that will be + used next. To see what the contents of this file should + look like, refer to + below. + If you have set the EMAIL + environment variable, this will be used + next. + Mercurial will query your system to find out + your local user name and host name, and construct a + username from these components. 
Since this often results + in a username that is not very useful, it will print a + warning if it has to do + this. + + If all of these mechanisms fail, Mercurial will + fail, printing an error message. In this case, it will not + let you commit until you set up a + username. + You should think of the HGUSER environment + variable and the + option to the hg commit + command as ways to override Mercurial's + default selection of username. For normal use, the simplest + and most robust way to set a username for yourself is by + creating a .hgrc file; see + below for details. + + Creating a Mercurial configuration file + + To set a user name, use your favorite editor + to create a file called .hgrc in your home directory. + Mercurial will use this file to look up your personalised + configuration settings. The initial contents of your + .hgrc should look like + this. + + + <quote>Home directory</quote> on Windows + + When we refer to your home directory, on an English + language installation of Windows this will usually be a + folder named after your user name in + C:\Documents and Settings. You can + find out the exact name of your home directory by opening + a command prompt window and running the following + command. + + C:\> echo %UserProfile% + + + # This is a Mercurial configuration file. +[ui] +username = Firstname Lastname <email.address@example.net> + + The [ui] line begins a + section of the config file, so you can + read the username = ... + line as meaning set the value of the + username item in the + ui section. A section continues + until a new section begins, or the end of the file. + Mercurial ignores empty lines and treats any text from + # to the end of a line as + a comment. + + + + Choosing a user name + + You can use any text you like as the value of + the username config item, since this + information is for reading by other people, but will not be + interpreted by Mercurial. 
The convention that most people + follow is to use their name and email address, as in the + example above. + + Mercurial's built-in web server obfuscates + email addresses, to make it more difficult for the email + harvesting tools that spammers use. This reduces the + likelihood that you'll start receiving more junk email if + you publish a Mercurial repository on the + web. + + + + Writing a commit message + + When we commit a change, Mercurial drops us into + a text editor, to enter a message that will describe the + modifications we've made in this changeset. This is called + the commit message. It will be a record + for readers of what we did and why, and it will be printed by + hg log after we've finished + committing. + + &interaction.tour.commit; + + The editor that the hg + commit command drops us into will contain an empty + line or two, followed by a number of lines starting with + HG:. + + +This is where I type my commit comment. + +HG: Enter commit message. Lines beginning with 'HG:' are removed. +HG: -- +HG: user: Bryan O'Sullivan <bos@serpentine.com> +HG: branch 'default' +HG: changed hello.c + + Mercurial ignores the lines that start with + HG:; it uses them only to + tell us which files it's recording changes to. Modifying or + deleting these lines has no effect. + + + Writing a good commit message + + Since hg log + only prints the first line of a commit message by default, + it's best to write a commit message whose first line stands + alone. Here's a real example of a commit message that + doesn't follow this guideline, and hence + has a summary that is not readable. + + +changeset: 73:584af0e231be +user: Censored Person <censored.person@example.org> +date: Tue Sep 26 21:37:07 2006 -0700 +summary: include buildmeister/commondefs. Add exports. + + As far as the remainder of the contents of the + commit message are concerned, there are no hard-and-fast + rules. 
Mercurial itself doesn't interpret or care about the + contents of the commit message, though your project may have + policies that dictate a certain kind of formatting. + My personal preference is for short, but + informative, commit messages that tell me something that I + can't figure out with a quick glance at the output of hg log --patch. + If we run the hg + commit command without any arguments, it records + all of the changes we've made, as reported by hg status and hg diff. + + + A surprise for Subversion users + + Like other Mercurial commands, if we don't supply + explicit names to commit to the hg + commit, it will operate across a repository's + entire working directory. Be wary of this if you're coming + from the Subversion or CVS world, since you might expect it + to operate only on the current directory that you happen to + be visiting and its subdirectories. + + + + + Aborting a commit + + If you decide that you don't want to commit + while in the middle of editing a commit message, simply exit + from your editor without saving the file that it's editing. + This will cause nothing to happen to either the repository or + the working directory. + + + + Admiring our new handiwork + + Once we've finished the commit, we can use the + hg tip command to display the + changeset we just created. This command produces output that + is identical to hg log, but + it only displays the newest revision in the repository. + + &interaction.tour.tip; + + We refer to the newest revision in the + repository as the tip revision, or simply + the tip. + + By the way, the hg tip + command accepts many of the same options as hg log, so above indicates be + verbose, + specifies print a patch. The use of to print patches is another + example of the consistent naming we mentioned earlier. + + + + + Sharing changes + + We mentioned earlier that repositories in + Mercurial are self-contained. This means that the changeset we + just created exists only in our my-hello repository. 
Let's look + at a few ways that we can propagate this change into other + repositories. + + + Pulling changes from another repository + + To get started, let's clone our original + hello repository, which + does not contain the change we just committed. We'll call our + temporary repository hello-pull. + + &interaction.tour.clone-pull; + + We'll use the hg + pull command to bring changes from my-hello into hello-pull. However, blindly + pulling unknown changes into a repository is a somewhat scary + prospect. Mercurial provides the hg + incoming command to tell us what changes the + hg pull command + would pull into the repository, without + actually pulling the changes in. + + &interaction.tour.incoming; + + Bringing changes into a repository is a simple + matter of running the hg pull + command, and optionally telling it which repository to pull from. + + &interaction.tour.pull; + + As you can see from the before-and-after output + of hg tip, we have + successfully pulled changes into our repository. However, + Mercurial separates pulling changes in from updating the + working directory. There remains one step before we will see + the changes that we just pulled appear in the working + directory. + + + Pulling specific changes + + It is possible that due to the delay between + running hg incoming and + hg pull, you may not see + all changesets that will be brought from the other + repository. Suppose you're pulling changes from a repository + on the network somewhere. While you are looking at the + hg incoming output, and + before you pull those changes, someone might have committed + something in the remote repository. This means that it's + possible to pull more changes than you saw when using + hg incoming. + + If you only want to pull precisely the changes that were + listed by hg incoming, or + you have some other reason to pull a subset of changes, + simply identify the change that you want to pull by its + changeset ID, e.g. hg pull + -r7e95bb. 
+ + + + + Updating the working directory + + We have so far glossed over the relationship + between a repository and its working directory. The hg pull command that we ran in + brought changes into the + repository, but if we check, there's no sign of those changes + in the working directory. This is because hg pull does not (by default) touch + the working directory. Instead, we use the hg update command to do this. + + &interaction.tour.update; + + It might seem a bit strange that hg pull doesn't update the working + directory automatically. There's actually a good reason for + this: you can use hg update + to update the working directory to the state it was in at + any revision in the history of the + repository. If you had the working directory updated to an + old revision&emdash;to hunt down the origin of a bug, + say&emdash;and ran a hg pull + which automatically updated the working directory to a new + revision, you might not be terribly happy. + + Since pull-then-update is such a common sequence + of operations, Mercurial lets you combine the two by passing + the option to hg pull. + + If you look back at the output of hg pull in when we ran it without , you can see that it printed + a helpful reminder that we'd have to take an explicit step to + update the working directory. + + To find out what revision the working directory + is at, use the hg parents + command. + + &interaction.tour.parents; + + If you look back at , you'll see arrows + connecting each changeset. The node that the arrow leads + from in each case is a parent, and the + node that the arrow leads to is its + child. The working directory has a parent in just the same + way; this is the changeset that the working directory + currently contains. + + To update the working directory to a particular + revision, give a revision number or changeset ID to the + hg update command. 
+ + &interaction.tour.older; + + If you omit an explicit revision, hg update will update to the tip + revision, as shown by the second call to hg update in the example + above. + + + + Pushing changes to another repository + + Mercurial lets us push changes to another + repository, from the repository we're currently visiting. As + with the example of hg pull + above, we'll create a temporary repository to push our changes + into. + + &interaction.tour.clone-push; + + The hg outgoing + command tells us what changes would be pushed into another + repository. + + &interaction.tour.outgoing; + + And the hg push + command does the actual push. + + &interaction.tour.push; + + As with hg + pull, the hg push + command does not update the working directory in the + repository that it's pushing changes into. Unlike hg pull, hg + push does not provide a -u + option that updates the other repository's working directory. + This asymmetry is deliberate: the repository we're pushing to + might be on a remote server and shared between several people. + If we were to update its working directory while someone was + working in it, their work would be disrupted. + + What happens if we try to pull or push changes + and the receiving repository already has those changes? + Nothing too exciting. + + &interaction.tour.push.nothing; + + + + Default locations + + When we clone a repository, Mercurial records the location + of the repository we cloned in the + .hg/hgrc file of the new repository. If + we don't supply a location to hg pull from + or hg push to, those commands will use this + location as a default. The hg incoming + and hg outgoing commands do so too. + + If you open a repository's .hg/hgrc + file in a text editor, you will see contents like the + following. + + [paths] +default = http://www.selenic.com/repo/hg + + It is possible&emdash;and often useful&emdash;to have the + default location for hg push and + hg outgoing be different from those for + hg pull and hg incoming. 
+ We can do this by adding a default-push + entry to the [paths] section of the + .hg/hgrc file, as follows. + + [paths] +default = http://www.selenic.com/repo/hg +default-push = http://hg.example.com/hg + + + + Sharing changes over a network + + The commands we have covered in the previous few + sections are not limited to working with local repositories. + Each works in exactly the same fashion over a network + connection; simply pass in a URL instead of a local + path. + + &interaction.tour.outgoing.net; + + In this example, we can see what changes we + could push to the remote repository, but the repository is + understandably not set up to let anonymous users push to + it. + + &interaction.tour.push.net; + + + + + Starting a new project + + It is just as easy to begin a new project as to work on one + that already exists. The hg init command + creates a new, empty Mercurial repository. + + &interaction.ch01-new.init; + + This simply creates a repository named + myproject in the current directory. + + &interaction.ch01-new.ls; + + We can tell that myproject is a + Mercurial repository, because it contains a + .hg directory. + + &interaction.ch01-new.ls2; + + If we want to add some pre-existing files to the repository, + we copy them into place, and tell Mercurial to start tracking + them using the hg add command. + + &interaction.ch01-new.add; + + Once we are satisfied that our project looks right, we + commit our changes. + + &interaction.ch01-new.commit; + + It takes just a few moments to start using Mercurial on a + new project, which is part of its appeal. Revision control is + now so easy to work with, we can use it on the smallest of + projects that we might not have considered with a more + complicated tool. + +
+ + diff -r 5276f40fca1c -r 896ab6eaf1c6 en/ch02-tour-merge.xml --- a/en/ch02-tour-merge.xml Thu Jul 09 13:32:44 2009 +0900 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,401 +0,0 @@ - - - - - A tour of Mercurial: merging work - - We've now covered cloning a repository, making changes in a - repository, and pulling or pushing changes from one repository - into another. Our next step is merging - changes from separate repositories. - - - Merging streams of work - - Merging is a fundamental part of working with a distributed - revision control tool. - - Alice and Bob each have a personal copy of a - repository for a project they're collaborating on. Alice - fixes a bug in her repository; Bob adds a new feature in - his. They want the shared repository to contain both the - bug fix and the new feature. - - I frequently work on several different tasks for - a single project at once, each safely isolated in its own - repository. Working this way means that I often need to - merge one piece of my own work with another. - - - Because merging is such a common thing to need to do, - Mercurial makes it easy. Let's walk through the process. We'll - begin by cloning yet another repository (see how often they - spring up?) and making a change in it. - - &interaction.tour.merge.clone; - - We should now have two copies of - hello.c with different contents. The - histories of the two repositories have also diverged, as - illustrated in . - - &interaction.tour.merge.cat; - -
- Divergent recent histories of the <filename - class="directory">my-hello</filename> and <filename - class="directory">my-new-hello</filename> - repositories - - - XXX add text - -
- - We already know that pulling changes from our my-hello repository will have no - effect on the working directory. - - &interaction.tour.merge.pull; - - However, the hg pull - command says something about heads. - - - Head changesets - - A head is a change that has no descendants, or children, - as they're also known. The tip revision is thus a head, - because the newest revision in a repository doesn't have any - children, but a repository can contain more than one - head. - -
- Repository contents after pulling from <filename - class="directory">my-hello</filename> into <filename - class="directory">my-new-hello</filename> - - - - - XXX add text - -
- - In , you can - see the effect of the pull from my-hello into my-new-hello. The history that - was already present in my-new-hello is untouched, but - a new revision has been added. By referring to , we can see that the - changeset ID remains the same in the new - repository, but the revision number has - changed. (This, incidentally, is a fine example of why it's - not safe to use revision numbers when discussing changesets.) - We can view the heads in a repository using the hg heads command. - - &interaction.tour.merge.heads; - -
- - Performing the merge - - What happens if we try to use the normal hg update command to update to the - new tip? - - &interaction.tour.merge.update; - - Mercurial is telling us that the hg - update command won't do a merge; it won't update - the working directory when it thinks we might want to do - a merge, unless we force it to do so. Instead, we use the - hg merge command to merge the - two heads. - - &interaction.tour.merge.merge; - - This updates the working directory so that it contains - changes from both heads, which is - reflected in both the output of hg - parents and the contents of - hello.c. - - &interaction.tour.merge.parents; - - - - Committing the results of the merge - - Whenever we've done a merge, hg - parents will display two parents until we hg commit the results of the - merge. - - &interaction.tour.merge.commit; - - We now have a new tip revision; notice that it has - both of our former heads as its parents. - These are the same revisions that were previously displayed by - hg parents. - - &interaction.tour.merge.tip; - - In , you can see a - representation of what happens to the working directory during - the merge, and how this affects the repository when the commit - happens. During the merge, the working directory has two - parent changesets, and these become the parents of the new - changeset. - -
- Working directory and repository during merge, and - following commit - - - - - XXX add text - -
- - We sometimes talk about a merge having - sides: the left side is the first parent - in the output of hg parents, - and the right side is the second. If the working directory - was at e.g. revision 5 before we began a merge, that revision - will become the left side of the merge. -
-
- - - Merging conflicting changes - - Most merges are simple affairs, but sometimes you'll find - yourself merging changes where each side modifies the same portions - of the same files. Unless both modifications are identical, - this results in a conflict, where you have - to decide how to reconcile the different changes into something - coherent. - -
- Conflicting changes to a document - - - XXX add text - -
- - illustrates - an instance of two conflicting changes to a document. We - started with a single version of the file; then we made some - changes, while someone else made different changes to the same - text. Our task in resolving the conflicting changes is to - decide what the file should look like. - - Mercurial doesn't have a built-in facility for handling - conflicts. Instead, it runs an external program, usually one - that displays some kind of graphical conflict resolution - interface. By default, Mercurial tries to find one of several - different merging tools that are likely to be installed on your - system. It first tries a few fully automatic merging tools; if - these don't succeed (because the resolution process requires - human guidance) or aren't present, it tries a few - different graphical merging tools. - - It's also possible to get Mercurial to run another program - or script instead of hgmerge, by setting the - HGMERGE environment variable to the name of your - preferred program. - - - Using a graphical merge tool - - My preferred graphical merge tool is - kdiff3, which I'll use to describe the - features that are common to graphical file merging tools. You - can see a screenshot of kdiff3 in action in - . The kind of - merge it is performing is called a three-way - merge, because there are three different versions - of the file of interest to us. The tool thus splits the upper - portion of the window into three panes: - - At the left is the base - version of the file, i.e. the most recent version from - which the two versions we're trying to merge are - descended. - - In the middle is our version of - the file, with the contents that we modified. - - On the right is their version - of the file, the one from the changeset that we're - trying to merge with. - - In the pane below these is the current - result of the merge.
Our task is to - replace all of the red text, which indicates unresolved - conflicts, with some sensible merger of the - ours and theirs versions of the - file. - - All four of these panes are locked - together; if we scroll vertically or horizontally - in any of them, the others are updated to display the - corresponding sections of their respective files. - -
- Using <command>kdiff3</command> to merge versions of a - file - - - - - XXX add text - - -
- - For each conflicting portion of the file, we can choose to - resolve the conflict using some combination of text from the - base version, ours, or theirs. We can also manually edit the - merged file at any time, in case we need to make further - modifications. - - There are many file merging tools - available, too many to cover here. They vary in which - platforms they are available for, and in their particular - strengths and weaknesses. Most are tuned for merging files - containing plain text, while a few are aimed at specialised - file formats (generally XML). - -
- - A worked example - - In this example, we will reproduce the file modification - history of - above. Let's begin by creating a repository with a base - version of our document. - - &interaction.tour-merge-conflict.wife; - - We'll clone the repository and make a change to the - file. - - &interaction.tour-merge-conflict.cousin; - - And another clone, to simulate someone else making a - change to the file. (This hints at the idea that it's not all - that unusual to merge with yourself when you isolate tasks in - separate repositories, and indeed to find and resolve - conflicts while doing so.) - - &interaction.tour-merge-conflict.son; - - Having created two - different versions of the file, we'll set up an environment - suitable for running our merge. - - &interaction.tour-merge-conflict.pull; - - In this example, I'll set - HGMERGE to tell Mercurial to use the - non-interactive merge command. This is - bundled with many Unix-like systems. (If you're following this - example on your computer, don't bother setting - HGMERGE.) - - &interaction.tour-merge-conflict.merge; - - Because merge can't resolve the - conflicting changes, it leaves merge - markers inside the file that has conflicts, - indicating which lines have conflicts, and whether they came - from our version of the file or theirs. - - Mercurial can tell from the way merge - exits that it wasn't able to merge successfully, so it tells - us what commands we'll need to run if we want to redo the - merging operation. This could be useful if, for example, we - were running a graphical merge tool and quit because we were - confused or realised we had made a mistake. - - If automatic or manual merges fail, there's nothing to - prevent us from fixing up the affected files - ourselves, and committing the results of our merge: - - &interaction.tour-merge-conflict.commit; - - -
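The idea of a three-way merge that leaves merge markers where both sides disagree can be sketched in a few lines of Python. This is a toy illustration only, not how merge or Mercurial actually work: the function name merge3_lines is hypothetical, the marker labels are chosen for this example, and the sketch assumes that the base, ours, and theirs versions all have the same number of lines (real tools first align the files with a diff algorithm).

```python
def merge3_lines(base, ours, theirs):
    """Toy three-way merge, one line at a time.

    Assumes all three versions have the same number of lines (a
    simplification made for this sketch; real tools align lines first).
    Returns (merged_lines, number_of_conflicts).
    """
    merged = []
    conflicts = 0
    for b, o, t in zip(base, ours, theirs):
        if o == t:
            # Both sides agree: unchanged, or changed identically.
            merged.append(o)
        elif o == b:
            # Only their side changed this line.
            merged.append(t)
        elif t == b:
            # Only our side changed this line.
            merged.append(o)
        else:
            # Both sides changed the line differently: leave markers.
            conflicts += 1
            merged.extend(["<<<<<<< ours", o, "=======", t, ">>>>>>> theirs"])
    return merged, conflicts


if __name__ == "__main__":
    base = ["line one", "line two", "line three"]
    ours = ["line one", "line two (our edit)", "line three"]
    theirs = ["line one", "line two (their edit)", "line three, extended"]
    merged, conflicts = merge3_lines(base, ours, theirs)
    print("\n".join(merged))
    print("conflicts:", conflicts)
```

In this sketch the third line merges cleanly because only one side touched it, while the second line is wrapped in markers because both sides changed it in different ways. A non-zero conflict count plays the role of the failing exit status that tells Mercurial the merge was not fully resolved.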
- - Simplifying the pull-merge-commit sequence - - The process of merging changes as outlined above is - straightforward, but requires running three commands in - sequence. - hg pull -u -hg merge -hg commit -m 'Merged remote changes' - In the case of the final commit, you also need to enter a - commit message, which is almost always going to be a piece of - uninteresting boilerplate text. - - It would be nice to reduce the number of steps needed, if - this were possible. Indeed, Mercurial is distributed with an - extension called fetch that - does just this. - - Mercurial provides a flexible extension mechanism that lets - people extend its functionality, while keeping the core of - Mercurial small and easy to deal with. Some extensions add new - commands that you can use from the command line, while others - work behind the scenes, for example adding - capabilities to the server. - - The fetch - extension adds a new command called, not surprisingly, hg fetch. This extension acts as a - combination of hg pull -u, - hg merge and hg commit. It begins by pulling - changes from another repository into the current repository. If - it finds that the changes added a new head to the repository, it - begins a merge, then (if the merge succeeded) commits the result - of the merge with an automatically-generated commit message. If - no new heads were added, it updates the working directory to the - new tip changeset. - - Enabling the fetch extension is easy. Edit the - .hgrc file in your home - directory, and either go to the extensions section or create an - extensions section. Then - add a line that simply reads - fetch=. - - [extensions] -fetch = - - (Normally, the right-hand side of the - = would indicate where to find - the extension, but since the fetch extension is in the standard - distribution, Mercurial knows where to search for it.) - - -
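The decision flow described above for hg fetch can be summarised as a small Python function. This is only a sketch of the behaviour as the text describes it, not the extension's actual code; the function name fetch_plan and its parameters are hypothetical, and comparing head counts before and after the pull is an assumption used here to model "the changes added a new head".

```python
def fetch_plan(heads_before, heads_after, merge_succeeded=True):
    """Return the sequence of steps a fetch-like command would perform.

    heads_before / heads_after: number of heads in the repository
    before and after the pull (an assumption of this sketch).
    """
    steps = ["pull"]
    if heads_after > heads_before:
        # The pull added a new head, so a merge is needed.
        steps.append("merge")
        if merge_succeeded:
            # Commit only when the merge went through cleanly.
            steps.append("commit")
    else:
        # No new head: just bring the working directory up to the new tip.
        steps.append("update")
    return steps


if __name__ == "__main__":
    print(fetch_plan(1, 2))                         # pull added a head
    print(fetch_plan(1, 2, merge_succeeded=False))  # merge needs human help
    print(fetch_plan(1, 1))                         # history stayed linear
```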
- - diff -r 5276f40fca1c -r 896ab6eaf1c6 en/ch03-concepts.xml --- a/en/ch03-concepts.xml Thu Jul 09 13:32:44 2009 +0900 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,754 +0,0 @@ - - - - - Behind the scenes - - Unlike many revision control systems, the concepts - upon which Mercurial is built are simple enough that it's easy to - understand how the software really works. Knowing these details - certainly isn't necessary, so it is certainly safe to skip this - chapter. However, I think you will get more out of the software - with a mental model of what's going on. - - Being able to understand what's going on behind the - scenes gives me confidence that Mercurial has been carefully - designed to be both safe and - efficient. And just as importantly, if it's - easy for me to retain a good idea of what the software is doing - when I perform a revision control task, I'm less likely to be - surprised by its behavior. - - In this chapter, we'll initially cover the core concepts - behind Mercurial's design, then continue to discuss some of the - interesting details of its implementation. - - - Mercurial's historical record - - - Tracking the history of a single file - - When Mercurial tracks modifications to a file, it stores - the history of that file in a metadata object called a - filelog. Each entry in the filelog - contains enough information to reconstruct one revision of the - file that is being tracked. Filelogs are stored as files in - the .hg/store/data directory. A - filelog contains two kinds of information: revision data, and - an index to help Mercurial to find a revision - efficiently. - - A file that is large, or has a lot of history, has its - filelog stored in separate data - (.d suffix) and index - (.i suffix) files. For - small files without much history, the revision data and index - are combined in a single .i - file. 
The correspondence between a file in the working - directory and the filelog that tracks its history in the - repository is illustrated in . - -
- Relationships between files in working directory and - filelogs in repository - - - XXX add text - -
- -
- - Managing tracked files - - Mercurial uses a structure called a - manifest to collect together information - about the files that it tracks. Each entry in the manifest - contains information about the files present in a single - changeset. An entry records which files are present in the - changeset, the revision of each file, and a few other pieces - of file metadata. - - - - Recording changeset information - - The changelog contains information - about each changeset. Each revision records who committed a - change, the changeset comment, other pieces of - changeset-related information, and the revision of the - manifest to use. - - - - Relationships between revisions - - Within a changelog, a manifest, or a filelog, each - revision stores a pointer to its immediate parent (or to its - two parents, if it's a merge revision). As I mentioned above, - there are also relationships between revisions - across these structures, and they are - hierarchical in nature. - - For every changeset in a repository, there is exactly one - revision stored in the changelog. Each revision of the - changelog contains a pointer to a single revision of the - manifest. A revision of the manifest stores a pointer to a - single revision of each filelog tracked when that changeset - was created. These relationships are illustrated in - . - -
- Metadata relationships - - - XXX add text - -
- - As the illustration shows, there is - not a one to one - relationship between revisions in the changelog, manifest, or - filelog. If the manifest hasn't changed between two - changesets, the changelog entries for those changesets will - point to the same revision of the manifest. If a file that - Mercurial tracks hasn't changed between two changesets, the - entry for that file in the two revisions of the manifest will - point to the same revision of its filelog. - -
-
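The pointer hierarchy described above can be sketched in a few lines of Python. This is only a toy model: plain lists stand in for revlogs, and the file names, users, and comments are made up for illustration. It is not how Mercurial lays out its data on disk.

```python
# Toy model: plain Python lists stand in for the three kinds of revlog.
# All file names, users, and comments here are hypothetical.

# One filelog per tracked file; each list element is one revision of the file.
filelogs = {
    "hello.c": [
        b"int main(void) { return 0; }\n",
        b"int main(void) { return 1; }\n",
    ],
    "README": [b"A tiny project.\n"],
}

# Each manifest revision maps a file name to a revision of that file's filelog.
manifest = [
    {"hello.c": 0, "README": 0},
    {"hello.c": 1, "README": 0},  # README is unchanged, so it still points at 0
]

# Each changelog revision records metadata and exactly one manifest revision.
changelog = [
    {"user": "alice", "comment": "initial import", "manifest": 0},
    {"user": "bob", "comment": "change exit status", "manifest": 1},
]


def checkout(changeset_rev):
    """Follow changelog -> manifest -> filelog to rebuild a changeset's files."""
    entry = changelog[changeset_rev]
    files = manifest[entry["manifest"]]
    return {name: filelogs[name][rev] for name, rev in files.items()}
```

Both manifest revisions point at revision 0 of the `README` filelog, which is the "not a one to one relationship" that the text describes.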
- - Safe, efficient storage - - The underpinnings of changelogs, manifests, and filelogs are - provided by a single structure called the - revlog. - - - Efficient storage - - The revlog provides efficient storage of revisions using a - delta mechanism. Instead of storing a - complete copy of a file for each revision, it stores the - changes needed to transform an older revision into the new - revision. For many kinds of file data, these deltas are - typically a fraction of a percent of the size of a full copy - of a file. - - Some obsolete revision control systems can only work with - deltas of text files. They must either store binary files as - complete snapshots or encoded into a text representation, both - of which are wasteful approaches. Mercurial can efficiently - handle deltas of files with arbitrary binary contents; it - doesn't need to treat text as special. - - - - Safe operation - - Mercurial only ever appends data to - the end of a revlog file. It never modifies a section of a - file after it has written it. This is both more robust and - efficient than schemes that need to modify or rewrite - data. - - In addition, Mercurial treats every write as part of a - transaction that can span a number of - files. A transaction is atomic: either - the entire transaction succeeds and its effects are all - visible to readers in one go, or the whole thing is undone. - This guarantee of atomicity means that if you're running two - copies of Mercurial, where one is reading data and one is - writing it, the reader will never see a partially written - result that might confuse it. - - The fact that Mercurial only appends to files makes it - easier to provide this transactional guarantee. The easier it - is to do stuff like this, the more confident you should be - that it's done correctly. - - - - Fast retrieval - - Mercurial cleverly avoids a pitfall common to all earlier - revision control systems: the problem of inefficient - retrieval. 
Most revision control systems store - the contents of a revision as an incremental series of - modifications against a snapshot. To - reconstruct a specific revision, you must first read the - snapshot, and then every one of the revisions between the - snapshot and your target revision. The more history that a - file accumulates, the more revisions you must read, hence the - longer it takes to reconstruct a particular revision. - -
- Snapshot of a revlog, with incremental deltas - - - XXX add text - -
- - The innovation that Mercurial applies to this problem is - simple but effective. Once the cumulative amount of delta - information stored since the last snapshot exceeds a fixed - threshold, it stores a new snapshot (compressed, of course), - instead of another delta. This makes it possible to - reconstruct any revision of a file - quickly. This approach works so well that it has since been - copied by several other revision control systems. - - illustrates - the idea. In an entry in a revlog's index file, Mercurial - stores the range of entries from the data file that it must - read to reconstruct a particular revision. - - - Aside: the influence of video compression - - If you're familiar with video compression or have ever - watched a TV feed through a digital cable or satellite - service, you may know that most video compression schemes - store each frame of video as a delta against its predecessor - frame. In addition, these schemes use lossy - compression techniques to increase the compression ratio, so - visual errors accumulate over the course of a number of - inter-frame deltas. - - Because it's possible for a video stream to drop - out occasionally due to signal glitches, and to - limit the accumulation of artefacts introduced by the lossy - compression process, video encoders periodically insert a - complete frame (called a key frame) into the - video stream; the next delta is generated against that - frame. This means that if the video signal gets - interrupted, it will resume once the next key frame is - received. Also, the accumulation of encoding errors - restarts anew with each key frame. - - -
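To make the snapshot-plus-delta idea concrete, here is a minimal sketch in Python. It follows the description above, not Mercurial's actual code: the delta format (trim the common prefix and suffix), the `ToyRevlog` class, and the way the threshold is measured are all assumptions made for illustration, and compression is left out entirely.

```python
from typing import List, Tuple

# A toy delta: "replace old[start:end] with data". Hypothetical format.
Delta = Tuple[int, int, bytes]


def make_delta(old: bytes, new: bytes) -> Delta:
    """Describe new relative to old by trimming the common prefix and suffix."""
    limit = min(len(old), len(new))
    prefix = 0
    while prefix < limit and old[prefix] == new[prefix]:
        prefix += 1
    suffix = 0
    while (suffix < limit - prefix
           and old[len(old) - 1 - suffix] == new[len(new) - 1 - suffix]):
        suffix += 1
    return (prefix, len(old) - suffix, new[prefix:len(new) - suffix])


def apply_delta(old: bytes, delta: Delta) -> bytes:
    start, end, data = delta
    return old[:start] + data + old[end:]


class ToyRevlog:
    def __init__(self, threshold: int):
        self.threshold = threshold      # max delta bytes allowed since a snapshot
        self.entries: List[tuple] = []  # ("snapshot", bytes) or ("delta", Delta)
        self.base: List[int] = []       # per revision: index of its snapshot
        self._tip = b""                 # full text of the newest revision
        self._chain_bytes = 0           # delta bytes stored since last snapshot

    def add(self, text: bytes) -> int:
        rev = len(self.entries)
        delta = make_delta(self._tip, text) if self.entries else None
        if delta is None or self._chain_bytes + len(delta[2]) > self.threshold:
            # First revision, or the chain has grown too long: store a snapshot.
            self.entries.append(("snapshot", text))
            self.base.append(rev)
            self._chain_bytes = 0
        else:
            self.entries.append(("delta", delta))
            self.base.append(self.base[-1])
            self._chain_bytes += len(delta[2])
        self._tip = text
        return rev

    def read(self, rev: int) -> bytes:
        """Read the nearest snapshot, then apply only the deltas after it."""
        start = self.base[rev]
        text = self.entries[start][1]
        for _kind, payload in self.entries[start + 1:rev + 1]:
            text = apply_delta(text, payload)
        return text
```

The `base` list plays the role of the index entry mentioned above: it tells `read` which range of entries it needs, so the cost of reconstructing a revision is bounded by the threshold and not by the length of the file's history.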
- - Identification and strong integrity - - Along with delta or snapshot information, a revlog entry - contains a cryptographic hash of the data that it represents. - This makes it difficult to forge the contents of a revision, - and easy to detect accidental corruption. - - Hashes provide more than a mere check against corruption; - they are used as the identifiers for revisions. The changeset - identification hashes that you see as an end user are from - revisions of the changelog. Although filelogs and the - manifest also use hashes, Mercurial only uses these behind the - scenes. - - Mercurial verifies that hashes are correct when it - retrieves file revisions and when it pulls changes from - another repository. If it encounters an integrity problem, it - will complain and stop whatever it's doing. - - In addition to the effect it has on retrieval efficiency, - Mercurial's use of periodic snapshots makes it more robust - against partial data corruption. If a revlog becomes partly - corrupted due to a hardware error or system bug, it's often - possible to reconstruct some or most revisions from the - uncorrupted sections of the revlog, both before and after the - corrupted section. This would not be possible with a - delta-only storage model. - - -
- - Revision history, branching, and merging - - Every entry in a Mercurial revlog knows the identity of its - immediate ancestor revision, usually referred to as its - parent. In fact, a revision contains room - for not one parent, but two. Mercurial uses a special hash, - called the null ID, to represent the idea - there is no parent here. This hash is simply a - string of zeroes. - - In , you can see - an example of the conceptual structure of a revlog. Filelogs, - manifests, and changelogs all have this same structure; they - differ only in the kind of data stored in each delta or - snapshot. - - The first revision in a revlog (at the bottom of the image) - has the null ID in both of its parent slots. For a - normal revision, its first parent slot contains - the ID of its parent revision, and its second contains the null - ID, indicating that the revision has only one real parent. Any - two revisions that have the same parent ID are branches. A - revision that represents a merge between branches has two normal - revision IDs in its parent slots. - -
- The conceptual structure of a revlog - - - XXX add text - -
- -
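The two parent slots, the null ID, and the use of a hash as an identifier can be sketched together. The text above does not say which hash is used or exactly what is hashed; the sketch assumes SHA-1 over the two sorted parent IDs followed by the revision text, which matches my understanding of Mercurial but should be treated as an assumption. The `Revision` type and helper names are hypothetical.

```python
import hashlib
from typing import NamedTuple

# The null ID: "there is no parent here". Assumed to be 20 zero bytes,
# the size of a SHA-1 digest.
NULL_ID = b"\0" * 20


class Revision(NamedTuple):
    node: bytes  # identifying hash of this revision
    p1: bytes    # first parent slot
    p2: bytes    # second parent slot
    text: bytes  # the data this revision represents


def node_id(text: bytes, p1: bytes = NULL_ID, p2: bytes = NULL_ID) -> bytes:
    # Assumption: hash the sorted parent IDs, then the text.
    low, high = sorted((p1, p2))
    return hashlib.sha1(low + high + text).digest()


def make_revision(text: bytes, p1: bytes = NULL_ID, p2: bytes = NULL_ID) -> Revision:
    return Revision(node_id(text, p1, p2), p1, p2, text)


def is_merge(rev: Revision) -> bool:
    """A merge has a real revision ID in both parent slots."""
    return rev.p1 != NULL_ID and rev.p2 != NULL_ID


def verify(rev: Revision) -> bool:
    """Recompute the hash; a mismatch means the revision is corrupt or forged."""
    return node_id(rev.text, rev.p1, rev.p2) == rev.node
```

Because the parent IDs feed into the hash, a revision's identifier depends on its whole ancestry, not only on its own contents.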
- - The working directory - - In the working directory, Mercurial stores a snapshot of the - files from the repository as of a particular changeset. - - The working directory knows which changeset - it contains. When you update the working directory to contain a - particular changeset, Mercurial looks up the appropriate - revision of the manifest to find out which files it was tracking - at the time that changeset was committed, and which revision of - each file was then current. It then recreates a copy of each of - those files, with the same contents it had when the changeset - was committed. - - The dirstate contains Mercurial's - knowledge of the working directory. This details which - changeset the working directory is updated to, and all of the - files that Mercurial is tracking in the working - directory. - - Just as a revision of a revlog has room for two parents, so - that it can represent either a normal revision (with one parent) - or a merge of two earlier revisions, the dirstate has slots for - two parents. When you use the hg - update command, the changeset that you update to is - stored in the first parent slot, and the null ID - in the second. When you hg - merge with another changeset, the first parent - remains unchanged, and the second parent is filled in with the - changeset you're merging with. The hg - parents command tells you what the parents of the - dirstate are. - - - What happens when you commit - - The dirstate stores parent information for more than just - book-keeping purposes. Mercurial uses the parents of the - dirstate as the parents of a new - changeset when you perform a commit. - -
- The working directory can have two parents - - - XXX add text - -
- - shows the - normal state of the working directory, where it has a single - changeset as parent. That changeset is the - tip, the newest changeset in the - repository that has no children. - -
- The working directory gains new parents after a - commit - - - XXX add text - -
- - It's useful to think of the working directory as - the changeset I'm about to commit. Any files - that you tell Mercurial that you've added, removed, renamed, - or copied will be reflected in that changeset, as will - modifications to any files that Mercurial is already tracking; - the new changeset will have the parents of the working - directory as its parents. - - After a commit, Mercurial will update the - parents of the working directory, so that the first parent is - the ID of the new changeset, and the second is the null ID. - This is shown in . Mercurial - doesn't touch any of the files in the working directory when - you commit; it just modifies the dirstate to note its new - parents. - -
- - Creating a new head - - It's perfectly normal to update the working directory to a - changeset other than the current tip. For example, you might - want to know what your project looked like last Tuesday, or - you could be looking through changesets to see which one - introduced a bug. In cases like this, the natural thing to do - is update the working directory to the changeset you're - interested in, and then examine the files in the working - directory directly to see their contents as they were when you - committed that changeset. The effect of this is shown in - . - -
- The working directory, updated to an older - changeset - - - XXX add text - -
- - Having updated the working directory to an - older changeset, what happens if you make some changes, and - then commit? Mercurial behaves in the same way as I outlined - above. The parents of the working directory become the - parents of the new changeset. This new changeset has no - children, so it becomes the new tip. And the repository now - contains two changesets that have no children; we call these - heads. You can see the structure that - this creates in . - -
- After a commit made while synced to an older - changeset - - - XXX add text - -
- - - If you're new to Mercurial, you should keep in mind a - common error, which is to use the hg pull command without any - options. By default, the hg - pull command does not - update the working directory, so you'll bring new changesets - into your repository, but the working directory will stay - synced at the same changeset as before the pull. If you - make some changes and commit afterwards, you'll thus create - a new head, because your working directory isn't synced to - whatever the current tip is. - - I put the word error in - quotes because all that you need to do to rectify this - situation is hg merge, then - hg commit. In other words, - this almost never has negative consequences; it's just - something of a surprise for newcomers. I'll discuss other - ways to avoid this behavior, and why Mercurial behaves in - this initially surprising way, later on. - - -
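The way a commit made from an older changeset creates a second head can be sketched as follows. This is a toy model, assuming a hypothetical `ToyRepo` class with made-up changeset names; an empty tuple stands in for the null ID, and no files are tracked at all.

```python
class ToyRepo:
    """Tracks only parent pointers, which is all we need to find heads."""

    def __init__(self):
        self.parents_of = {}        # changeset id -> tuple of parent ids
        self.order = []             # changeset ids, oldest first
        self.working_parents = ()   # parents of the working directory

    def commit(self, cid):
        # The parents of the working directory become the parents of the
        # new changeset; the working directory then sits on the new changeset.
        self.parents_of[cid] = self.working_parents
        self.order.append(cid)
        self.working_parents = (cid,)
        return cid

    def update(self, cid):
        # Updating only changes which changeset the working directory is at.
        self.working_parents = (cid,)

    def heads(self):
        # A head is a changeset that no other changeset names as a parent.
        has_child = {p for parents in self.parents_of.values() for p in parents}
        return [cid for cid in self.order if cid not in has_child]
```

Committing `a`, `b`, `c` in a row leaves `c` as the only head. After `update("a")` followed by `commit("d")`, both `c` and `d` are heads, and `d` has `a` as its parent.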
- - Merging changes - - When you run the hg - merge command, Mercurial leaves the first parent - of the working directory unchanged, and sets the second parent - to the changeset you're merging with, as shown in . - -
- Merging two heads - - - - - XXX add text - -
- - Mercurial also has to modify the working directory, to - merge the files managed in the two changesets. Simplified a - little, the merging process goes like this, for every file in - the manifests of both changesets. - - If neither changeset has modified a file, do - nothing with that file. - - If one changeset has modified a file, and the - other hasn't, create the modified copy of the file in the - working directory. - - If one changeset has removed a file, and the - other hasn't (or has also deleted it), delete the file - from the working directory. - - If one changeset has removed a file, but the - other has modified the file, ask the user what to do: keep - the modified file, or remove it? - - If both changesets have modified a file, - invoke an external merge program to choose the new - contents for the merged file. This may require input from - the user. - - If one changeset has modified a file, and the - other has renamed or copied the file, make sure that the - changes follow the new name of the file. - - There are more details&emdash;merging has plenty of corner - cases&emdash;but these are the most common choices that are - involved in a merge. As you can see, most cases are - completely automatic, and indeed most merges finish - automatically, without requiring your input to resolve any - conflicts. - - When you're thinking about what happens when you commit - after a merge, once again the working directory is the - changeset I'm about to commit. After the hg merge command completes, the - working directory has two parents; these will become the - parents of the new changeset. - - Mercurial lets you perform multiple merges, but you must - commit the results of each individual merge as you go. This - is necessary because Mercurial only tracks two parents for - both revisions and the working directory. 
While it would be - technically possible to merge multiple changesets at once, the - prospect of user confusion and making a terrible mess of a - merge immediately becomes overwhelming. - -
- - - Merging and renames - - A surprising number of revision control systems pay little - or no attention to a file's name over - time. For instance, it used to be common that if a file got - renamed on one side of a merge, the changes from the other - side would be silently dropped. - - Mercurial records metadata when you tell it to perform a - rename or copy. It uses this metadata during a merge to do the - right thing in the case of a merge. For instance, if I rename - a file, and you edit it without renaming it, when we merge our - work the file will be renamed and have your edits - applied. - -
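The per-file decisions listed above can be written as a small function. This is a simplified sketch, not Mercurial's merge code: it assumes the file exists in a common base version, represents a removed file as `None`, and leaves out renames and copies. The function name and the returned strings are hypothetical.

```python
from typing import Optional


def merge_action(base: bytes,
                 local: Optional[bytes],
                 other: Optional[bytes]) -> str:
    """Decide what to do with one file; None means that side removed it."""
    local_changed = local != base
    other_changed = other != base

    if not local_changed and not other_changed:
        return "nothing to do"

    if local is None or other is None:
        survivor = other if local is None else local
        if survivor is None or survivor == base:
            # Removed on one side, and untouched or also removed on the other.
            return "remove file"
        # Removed on one side but modified on the other.
        return "ask user: keep or remove"

    if local_changed and other_changed:
        return "run merge tool"

    # Exactly one side modified the file.
    return "take modified copy"
```

Only the last two outcomes before the final return can involve the user, which is why most merges complete without any input.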
- - - Other interesting design features - - In the sections above, I've tried to highlight some of the - most important aspects of Mercurial's design, to illustrate that - it pays careful attention to reliability and performance. - However, the attention to detail doesn't stop there. There are - a number of other aspects of Mercurial's construction that I - personally find interesting. I'll detail a few of them here, - separate from the big ticket items above, so that - if you're interested, you can gain a better idea of the amount - of thinking that goes into a well-designed system. - - - Clever compression - - When appropriate, Mercurial will store both snapshots and - deltas in compressed form. It does this by always - trying to compress a snapshot or delta, - but only storing the compressed version if it's smaller than - the uncompressed version. - - This means that Mercurial does the right - thing when storing a file whose native form is - compressed, such as a zip archive or a JPEG - image. When these types of files are compressed a second - time, the resulting file is usually bigger than the - once-compressed form, and so Mercurial will store the plain - zip or JPEG. - - Deltas between revisions of a compressed file are usually - larger than snapshots of the file, and Mercurial again does - the right thing in these cases. It finds that - such a delta exceeds the threshold at which it should store a - complete snapshot of the file, so it stores the snapshot, - again saving space compared to a naive delta-only - approach. - - - Network recompression - - When storing revisions on disk, Mercurial uses the - deflate compression algorithm (the same one - used by the popular zip archive format), - which balances good speed with a respectable compression - ratio. However, when transmitting revision data over a - network connection, Mercurial uncompresses the compressed - revision data. 
- - If the connection is over HTTP, Mercurial recompresses - the entire stream of data using a compression algorithm that - gives a better compression ratio (the Burrows-Wheeler - algorithm from the widely used bzip2 - compression package). This combination of algorithm and - compression of the entire stream (instead of a revision at a - time) substantially reduces the number of bytes to be - transferred, yielding better network performance over most - kinds of network. - - (If the connection is over ssh, - Mercurial doesn't recompress the - stream, because ssh can already do this - itself.) - - - - - Read/write ordering and atomicity - - Appending to files isn't the whole story when - it comes to guaranteeing that a reader won't see a partial - write. If you recall , - revisions in - the changelog point to revisions in the manifest, and - revisions in the manifest point to revisions in filelogs. - This hierarchy is deliberate. - - A writer starts a transaction by writing filelog and - manifest data, and doesn't write any changelog data until - those are finished. A reader starts by reading changelog - data, then manifest data, followed by filelog data. - - Since the writer has always finished writing filelog and - manifest data before it writes to the changelog, a reader will - never read a pointer to a partially written manifest revision - from the changelog, and it will never read a pointer to a - partially written filelog revision from the manifest. - - - - Concurrent access - - The read/write ordering and atomicity guarantees mean that - Mercurial never needs to lock a - repository when it's reading data, even if the repository is - being written to while the read is occurring. This has a big - effect on scalability; you can have an arbitrary number of - Mercurial processes safely reading data from a repository - all at once, no matter whether it's being written to or - not. 
- - The lockless nature of reading means that if you're - sharing a repository on a multi-user system, you don't need to - grant other local users permission to - write to your repository in order for - them to be able to clone it or pull changes from it; they only - need read permission. (This is - not a common feature among revision - control systems, so don't take it for granted! Most require - readers to be able to lock a repository to access it safely, - and this requires write permission on at least one directory, - which of course makes for all kinds of nasty and annoying - security and administrative problems.) - - Mercurial uses locks to ensure that only one process can - write to a repository at a time (the locking mechanism is safe - even over filesystems that are notoriously hostile to locking, - such as NFS). If a repository is locked, a writer will wait - for a while to retry if the repository becomes unlocked, but - if the repository remains locked for too long, the process - attempting to write will time out after a while. This means - that your daily automated scripts won't get stuck forever and - pile up if a system crashes unnoticed, for example. (Yes, the - timeout is configurable, from zero to infinity.) - - - Safe dirstate access - - As with revision data, Mercurial doesn't take a lock to - read the dirstate file; it does acquire a lock to write it. - To avoid the possibility of reading a partially written copy - of the dirstate file, Mercurial writes to a file with a - unique name in the same directory as the dirstate file, then - renames the temporary file atomically to - dirstate. The file named - dirstate is thus guaranteed to be - complete, not partially written. - - - - - Avoiding seeks - - Critical to Mercurial's performance is the avoidance of - seeks of the disk head, since any seek is far more expensive - than even a comparatively large read operation. - - This is why, for example, the dirstate is stored in a - single file. 
If there were a dirstate file per directory that - Mercurial tracked, the disk would seek once per directory. - Instead, Mercurial reads the entire single dirstate file in - one step. - - Mercurial also uses a copy on write scheme - when cloning a repository on local storage. Instead of - copying every revlog file from the old repository into the new - repository, it makes a hard link, which is a - shorthand way to say these two names point to the same - file. When Mercurial is about to write to one of a - revlog's files, it checks to see if the number of names - pointing at the file is greater than one. If it is, more than - one repository is using the file, so Mercurial makes a new - copy of the file that is private to this repository. - - A few revision control developers have pointed out that - this idea of making a complete private copy of a file is not - very efficient in its use of storage. While this is true, - storage is cheap, and this method gives the highest - performance while deferring most book-keeping to the operating - system. An alternative scheme would most likely reduce - performance and increase the complexity of the software, each - of which is much more important to the feel of - day-to-day use. - - - - Other contents of the dirstate - - Because Mercurial doesn't force you to tell it when you're - modifying a file, it uses the dirstate to store some extra - information so it can determine efficiently whether you have - modified a file. For each file in the working directory, it - stores the time that it last modified the file itself, and the - size of the file at that time. - - When you explicitly hg - add, hg remove, - hg rename or hg copy files, Mercurial updates the - dirstate so that it knows what to do with those files when you - commit. - - When Mercurial is checking the states of files in the - working directory, it first checks a file's modification time. - If that has not changed, the file must not have been modified. 
- If the file's size has changed, the file must have been - modified. If the modification time has changed, but the size - has not, only then does Mercurial need to read the actual - contents of the file to see if they've changed. Storing these - few extra pieces of information dramatically reduces the - amount of data that Mercurial needs to read, which yields - large performance improvements compared to other revision - control systems. - - - -
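The order of checks described above (modification time, then size, then contents) can be sketched like this. It is an illustration under stated assumptions: `DirstateEntry`, `record`, and `is_modified` are hypothetical names, times are truncated to whole seconds, and the tracked contents are passed in directly where Mercurial would look them up in the repository.

```python
import os
from dataclasses import dataclass


@dataclass
class DirstateEntry:
    mtime: int  # modification time when Mercurial last wrote the file
    size: int   # size of the file at that time


def record(path: str) -> DirstateEntry:
    """Remember the time and size of a file, as the dirstate does."""
    st = os.stat(path)
    return DirstateEntry(int(st.st_mtime), st.st_size)


def is_modified(path: str, entry: DirstateEntry, tracked_contents: bytes) -> bool:
    st = os.stat(path)
    if int(st.st_mtime) == entry.mtime:
        return False          # time unchanged: treat the file as unmodified
    if st.st_size != entry.size:
        return True           # size changed: certainly modified
    # Time changed but size did not: only now read the file itself.
    with open(path, "rb") as f:
        return f.read() != tracked_contents
```

The two cheap `stat` comparisons settle most files, so the expensive read happens only in the one case they cannot decide.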
- - diff -r 5276f40fca1c -r 896ab6eaf1c6 en/ch03-tour-merge.xml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/en/ch03-tour-merge.xml Fri Jul 10 02:32:17 2009 +0900 @@ -0,0 +1,454 @@ + + + + + A tour of Mercurial: merging work + + We've now covered cloning a repository, making changes in a + repository, and pulling or pushing changes from one repository + into another. Our next step is merging + changes from separate repositories. + + + Merging streams of work + + Merging is a fundamental part of working with a distributed + revision control tool. Here are a few cases in which the need + to merge work arises. + + + Alice and Bob each have a personal copy of a + repository for a project they're collaborating on. Alice + fixes a bug in her repository; Bob adds a new feature in + his. They want the shared repository to contain both the + bug fix and the new feature. + + + Cynthia frequently works on several different + tasks for a single project at once, each safely isolated in + its own repository. Working this way means that she often + needs to merge one piece of her own work with + another. + + + + Because we need to merge often, Mercurial makes + the process easy. Let's walk through a merge. We'll begin by + cloning yet another repository (see how often they spring up?) + and making a change in it. + + &interaction.tour.merge.clone; + + We should now have two copies of + hello.c with different contents. The + histories of the two repositories have also diverged, as + illustrated in . Here is a copy of our + file from one repository. + + &interaction.tour.merge.cat1; + + And here is our slightly different version from the other + repository. + + &interaction.tour.merge.cat2; + +
+ Divergent recent histories of the <filename + class="directory">my-hello</filename> and <filename + class="directory">my-new-hello</filename> + repositories + + + XXX add text + +
+ + We already know that pulling changes from our my-hello repository will have no + effect on the working directory. + + &interaction.tour.merge.pull; + + However, the hg pull + command says something about heads. + + + Head changesets + + Remember that Mercurial records what the parent + of each change is. If a change has a parent, we call it a + child or descendant of the parent. A head is a change that + has no children. The tip revision is thus a head, because the + newest revision in a repository doesn't have any children. + There are times when a repository can contain more than one + head. + +
+ Repository contents after pulling from <filename + class="directory">my-hello</filename> into <filename + class="directory">my-new-hello</filename> + + + + + XXX add text + +
+ + In , you can + see the effect of the pull from my-hello into my-new-hello. The history that + was already present in my-new-hello is untouched, but + a new revision has been added. By referring to , we can see that the + changeset ID remains the same in the new + repository, but the revision number has + changed. (This, incidentally, is a fine example of why it's + not safe to use revision numbers when discussing changesets.) + We can view the heads in a repository using the hg heads command. + + &interaction.tour.merge.heads; +
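To see why a changeset ID survives a pull while its revision number does not, consider this small sketch. It assumes a repository can be modelled as a list of changeset IDs in the order they arrived, with the list index as the revision number; the short IDs and the `pull` helper are hypothetical.

```python
def pull(dest, source):
    """Append to dest every changeset from source that dest lacks."""
    for changeset_id in source:
        if changeset_id not in dest:
            dest.append(changeset_id)


# Made-up short IDs; the two repositories share their first two changesets.
my_hello = ["a1", "b2", "c3"]
my_new_hello = ["a1", "b2", "d4"]

pull(my_new_hello, my_hello)

# The ID "c3" is the same everywhere, but its revision number is local.
rev_in_my_hello = my_hello.index("c3")
rev_in_my_new_hello = my_new_hello.index("c3")
```

After the pull, `c3` is revision 2 in `my_hello` but revision 3 in `my_new_hello`, because `d4` was already occupying revision 2 there.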
+ + + Performing the merge + + What happens if we try to use the normal hg update command to update to the + new tip? + + &interaction.tour.merge.update; + + Mercurial is telling us that the hg update command won't do a merge; + it won't update the working directory when it thinks we might + want to do a merge, unless we force it to do so. + (Incidentally, forcing the update with hg update + -C would revert any uncommitted changes in the + working directory.) + + To start a merge between the two heads, we use the + hg merge command. + + &interaction.tour.merge.merge; + + We resolve the contents of hello.c + +This updates the working directory so that it + contains changes from both heads, which + is reflected in both the output of hg + parents and the contents of + hello.c. + + &interaction.tour.merge.parents; + + + + Committing the results of the merge + + Whenever we've done a merge, hg + parents will display two parents until we hg commit the results of the + merge. + + &interaction.tour.merge.commit; + + We now have a new tip revision; notice that it has + both of our former heads as its parents. + These are the same revisions that were previously displayed by + hg parents. + + &interaction.tour.merge.tip; + + In , you can see a + representation of what happens to the working directory during + the merge, and how this affects the repository when the commit + happens. During the merge, the working directory has two + parent changesets, and these become the parents of the new + changeset. + +
+ Working directory and repository during merge, and + following commit + + + + + XXX add text + +
+ + We sometimes talk about a merge having + sides: the left side is the first parent + in the output of hg parents, + and the right side is the second. If the working directory + was at, say, revision 5 before we began a merge, that revision + will become the left side of the merge. +
+
+ + + Merging conflicting changes + + Most merges are simple affairs, but sometimes you'll find + yourself merging changes where each side modifies the same portions + of the same files. Unless both modifications are identical, + this results in a conflict, where you have + to decide how to reconcile the different changes into something + coherent. + +
+ Conflicting changes to a document + + + XXX add text + +
+ + illustrates + an instance of two conflicting changes to a document. We + started with a single version of the file; then we made some + changes, while someone else made different changes to the same + text. Our task in resolving the conflicting changes is to + decide what the file should look like. + + Mercurial doesn't have a built-in facility for handling + conflicts. Instead, it runs an external program, usually one + that displays some kind of graphical conflict resolution + interface. By default, Mercurial tries to find one of several + different merging tools that are likely to be installed on your + system. It first tries a few fully automatic merging tools; if + these don't succeed (because the resolution process requires + human guidance) or aren't present, it tries a few + different graphical merging tools. + + It's also possible to get Mercurial to run a + specific program or script, by setting the + HGMERGE environment variable to the name of your + preferred program. + + + Using a graphical merge tool + + My preferred graphical merge tool is + kdiff3, which I'll use to describe the + features that are common to graphical file merging tools. You + can see a screenshot of kdiff3 in action in + . The kind of + merge it is performing is called a three-way + merge, because there are three different versions + of the file of interest to us. The tool thus splits the upper + portion of the window into three panes: + + At the left is the base + version of the file, i.e. the most recent version from + which the two versions we're trying to merge are + descended. + + In the middle is our version of + the file, with the contents that we modified. + + On the right is their version + of the file, the one from the changeset that we're + trying to merge with. + + In the pane below these is the current + result of the merge. 
Our task is to + replace all of the red text, which indicates unresolved + conflicts, with some sensible merger of the + ours and theirs versions of the + file. + + All four of these panes are locked + together; if we scroll vertically or horizontally + in any of them, the others are updated to display the + corresponding sections of their respective files. + +
+ Using <command>kdiff3</command> to merge versions of a + file + + + + + XXX add text + + +
+ + For each conflicting portion of the file, we can choose to + resolve the conflict using some combination of text from the + base version, ours, or theirs. We can also manually edit the + merged file at any time, in case we need to make further + modifications. + + There are many file merging tools + available, too many to cover here. They vary in which + platforms they are available for, and in their particular + strengths and weaknesses. Most are tuned for merging files + containing plain text, while a few are aimed at specialised + file formats (generally XML). +
+ + + A worked example + + In this example, we will reproduce the file modification + history of + above. Let's begin by creating a repository with a base + version of our document. + + &interaction.tour-merge-conflict.wife; + + We'll clone the repository and make a change to the + file. + + &interaction.tour-merge-conflict.cousin; + + And another clone, to simulate someone else making a + change to the file. (This hints at the idea that it's not all + that unusual to merge with yourself when you isolate tasks in + separate repositories, and indeed to find and resolve + conflicts while doing so.) + + &interaction.tour-merge-conflict.son; + + Having created two + different versions of the file, we'll set up an environment + suitable for running our merge. + + &interaction.tour-merge-conflict.pull; + + In this example, I'll set + HGMERGE to tell Mercurial to use the + non-interactive merge command. This is + bundled with many Unix-like systems. (If you're following this + example on your computer, don't bother setting + HGMERGE. You'll get dropped into a GUI file + merge tool instead, which is much preferable.) + + &interaction.tour-merge-conflict.merge; + + Because merge can't resolve the + conflicting changes, it leaves merge + markers inside the file that has conflicts, + indicating which lines have conflicts, and whether they came + from our version of the file or theirs. + + Mercurial can tell from the way merge + exits that it wasn't able to merge successfully, so it tells + us what commands we'll need to run if we want to redo the + merging operation. This could be useful if, for example, we + were running a graphical merge tool and quit because we were + confused or realised we had made a mistake. + + If automatic or manual merges fail, there's nothing to + prevent us from fixing up the affected files + ourselves, and committing the results of our merge: + + &interaction.tour-merge-conflict.commit; + + + Where is the <command>hg resolve</command> command? 
+ + The hg resolve command was introduced + in Mercurial 1.1, which was released in December 2008. If + you are using an older version of Mercurial (run hg + version to see), this command will not be + present. If your version of Mercurial is older than 1.1, + you should strongly consider upgrading to a newer version + before trying to tackle complicated merges. + + +
+ + + Simplifying the pull-merge-commit sequence + + The process of merging changes as outlined above is + straightforward, but requires running three commands in + sequence. + hg pull -u +hg merge +hg commit -m 'Merged remote changes' + In the case of the final commit, you also need to enter a + commit message, which is almost always going to be a piece of + uninteresting boilerplate text. + + It would be nice to reduce the number of steps needed, if + this were possible. Indeed, Mercurial is distributed with an + extension called fetch that + does just this. + + Mercurial provides a flexible extension mechanism that lets + people extend its functionality, while keeping the core of + Mercurial small and easy to deal with. Some extensions add new + commands that you can use from the command line, while others + work behind the scenes, for example adding + capabilities to Mercurial's built-in server mode. + + The fetch + extension adds a new command called, not surprisingly, hg fetch. This extension acts as a + combination of hg pull -u, + hg merge and hg commit. It begins by pulling + changes from another repository into the current repository. If + it finds that the changes added a new head to the repository, it + updates to the new head, begins a merge, then (if the merge + succeeded) commits the result of the merge with an + automatically-generated commit message. If no new heads were + added, it updates the working directory to the new tip + changeset. + + Enabling the fetch extension is easy. Edit the + .hgrc file in your home + directory, and either go to the extensions section or create an + extensions section. Then + add a line that simply reads + fetch=. + + [extensions] +fetch = + + (Normally, the right-hand side of the + = would indicate where to find + the extension, but since the fetch extension is in the standard + distribution, Mercurial knows where to search for it.) 
+ + + + Renaming, copying, and merging + + During the life of a project, we will often want to change + the layout of its files and directories. This can be as simple + as renaming a single file, or as complex as restructuring the + entire hierarchy of files within the project. + + Mercurial supports these kinds of complex changes fluently, + provided we tell it what we're doing. If we want to rename a + file, we should use the hg rename + If you're a Unix user, you'll be glad to know that the + hg rename command can be abbreviated as + hg mv. + command to rename it, so that Mercurial can do the + right thing later when we merge. + + We will cover the use of these commands in more detail in + . + +
+ + diff -r 5276f40fca1c -r 896ab6eaf1c6 en/ch04-concepts.xml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/en/ch04-concepts.xml Fri Jul 10 02:32:17 2009 +0900 @@ -0,0 +1,778 @@ + + + + + Behind the scenes + + Unlike many revision control systems, the concepts + upon which Mercurial is built are simple enough that it's easy to + understand how the software really works. Knowing these details + certainly isn't necessary, so it is certainly safe to skip this + chapter. However, I think you will get more out of the software + with a mental model of what's going on. + + Being able to understand what's going on behind the + scenes gives me confidence that Mercurial has been carefully + designed to be both safe and + efficient. And just as importantly, if it's + easy for me to retain a good idea of what the software is doing + when I perform a revision control task, I'm less likely to be + surprised by its behavior. + + In this chapter, we'll initially cover the core concepts + behind Mercurial's design, then continue to discuss some of the + interesting details of its implementation. + + + Mercurial's historical record + + + Tracking the history of a single file + + When Mercurial tracks modifications to a file, it stores + the history of that file in a metadata object called a + filelog. Each entry in the filelog + contains enough information to reconstruct one revision of the + file that is being tracked. Filelogs are stored as files in + the .hg/store/data directory. A + filelog contains two kinds of information: revision data, and + an index to help Mercurial to find a revision + efficiently. + + A file that is large, or has a lot of history, has its + filelog stored in separate data + (.d suffix) and index + (.i suffix) files. For + small files without much history, the revision data and index + are combined in a single .i + file. 
The correspondence between a file in the working + directory and the filelog that tracks its history in the + repository is illustrated in the figure below. + +
+ Relationships between files in working directory and + filelogs in repository + + + XXX add text + +
+ +
+ + Managing tracked files + + Mercurial uses a structure called a + manifest to collect together information + about the files that it tracks. Each entry in the manifest + contains information about the files present in a single + changeset. An entry records which files are present in the + changeset, the revision of each file, and a few other pieces + of file metadata. + + + + Recording changeset information + + The changelog contains information + about each changeset. Each revision records who committed a + change, the changeset comment, other pieces of + changeset-related information, and the revision of the + manifest to use. + + + + Relationships between revisions + + Within a changelog, a manifest, or a filelog, each + revision stores a pointer to its immediate parent (or to its + two parents, if it's a merge revision). As I mentioned above, + there are also relationships between revisions + across these structures, and they are + hierarchical in nature. + + For every changeset in a repository, there is exactly one + revision stored in the changelog. Each revision of the + changelog contains a pointer to a single revision of the + manifest. A revision of the manifest stores a pointer to a + single revision of each filelog tracked when that changeset + was created. These relationships are illustrated in + the figure below. + +
+ Metadata relationships + + + XXX add text + +
+ + As the illustration shows, there is + not a one to one + relationship between revisions in the changelog, manifest, or + filelog. If a file that + Mercurial tracks hasn't changed between two changesets, the + entry for that file in the two revisions of the manifest will + point to the same revision of its filelog + It is possible (though unusual) for the manifest to + remain the same between two changesets, in which case the + changelog entries for those changesets will point to the + same revision of the manifest. + . + +
+
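The chain of pointers described above (changelog to manifest, manifest to filelog) can be sketched with a few plain Python containers. This is only a toy model: the dictionary layout, the file names, and the helper `files_at` are invented for illustration and say nothing about Mercurial's real on-disk format.

```python
# Toy model of the changelog -> manifest -> filelog hierarchy.
# All names and data here are hypothetical, chosen only for illustration.

# One filelog per tracked file; each list element is one revision of the file.
filelogs = {
    "hello.c": ["int main() {}", "int main() { return 0; }"],
    "README": ["Read me."],
}

# Each manifest revision maps a file name to a revision of that file's filelog.
manifest = [
    {"hello.c": 0, "README": 0},
    {"hello.c": 1, "README": 0},  # README unchanged: same filelog revision
]

# Each changelog revision points to exactly one manifest revision.
changelog = [
    {"user": "alice", "desc": "initial import", "manifest": 0},
    {"user": "bob", "desc": "return a value", "manifest": 1},
]


def files_at(changeset):
    """Follow changelog -> manifest -> filelog to rebuild a changeset's files."""
    man = manifest[changelog[changeset]["manifest"]]
    return {name: filelogs[name][rev] for name, rev in man.items()}
```

Note how the second manifest revision reuses filelog revision 0 of `README`: a file that did not change between two changesets is not stored twice.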
+ + Safe, efficient storage + + The underpinnings of changelogs, manifests, and filelogs are + provided by a single structure called the + revlog. + + + Efficient storage + + The revlog provides efficient storage of revisions using a + delta mechanism. Instead of storing a + complete copy of a file for each revision, it stores the + changes needed to transform an older revision into the new + revision. For many kinds of file data, these deltas are + typically a fraction of a percent of the size of a full copy + of a file. + + Some obsolete revision control systems can only work with + deltas of text files. They must either store binary files as + complete snapshots or encoded into a text representation, both + of which are wasteful approaches. Mercurial can efficiently + handle deltas of files with arbitrary binary contents; it + doesn't need to treat text as special. + + + + Safe operation + + Mercurial only ever appends data to + the end of a revlog file. It never modifies a section of a + file after it has written it. This is both more robust and + efficient than schemes that need to modify or rewrite + data. + + In addition, Mercurial treats every write as part of a + transaction that can span a number of + files. A transaction is atomic: either + the entire transaction succeeds and its effects are all + visible to readers in one go, or the whole thing is undone. + This guarantee of atomicity means that if you're running two + copies of Mercurial, where one is reading data and one is + writing it, the reader will never see a partially written + result that might confuse it. + + The fact that Mercurial only appends to files makes it + easier to provide this transactional guarantee. The easier it + is to do stuff like this, the more confident you should be + that it's done correctly. + + + + Fast retrieval + + Mercurial cleverly avoids a pitfall common to + all earlier revision control systems: the problem of + inefficient retrieval. 
Most revision + control systems store the contents of a revision as an + incremental series of modifications against a + snapshot. (Some base the snapshot on the + oldest revision, others on the newest.) To reconstruct a + specific revision, you must first read the snapshot, and then + every one of the revisions between the snapshot and your + target revision. The more history that a file accumulates, + the more revisions you must read, hence the longer it takes to + reconstruct a particular revision. + +
+ Snapshot of a revlog, with incremental deltas + + + XXX add text + +
+ + The innovation that Mercurial applies to this problem is + simple but effective. Once the cumulative amount of delta + information stored since the last snapshot exceeds a fixed + threshold, it stores a new snapshot (compressed, of course), + instead of another delta. This makes it possible to + reconstruct any revision of a file + quickly. This approach works so well that it has since been + copied by several other revision control systems. + + The figure above illustrates + the idea. In an entry in a revlog's index file, Mercurial + stores the range of entries from the data file that it must + read to reconstruct a particular revision. + + + Aside: the influence of video compression + + If you're familiar with video compression or + have ever watched a TV feed through a digital cable or + satellite service, you may know that most video compression + schemes store each frame of video as a delta against its + predecessor frame. + + Mercurial borrows this idea to make it + possible to reconstruct a revision from a snapshot and a + small number of deltas. + + +
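To make the snapshot-plus-deltas idea concrete, here is a minimal sketch in Python. It makes several simplifying assumptions that are not part of Mercurial: a "delta" is just the text appended since the previous revision, the threshold value is arbitrary, and the class name `ToyRevlog` is made up. What it does show is the rule from the text: once the deltas stored since the last snapshot would pass the threshold, store a fresh snapshot instead.

```python
# Toy revlog: periodic snapshots bound the number of deltas to replay.
# THRESHOLD and the append-only "delta" format are illustrative assumptions.

THRESHOLD = 20  # hypothetical limit on delta bytes since the last snapshot


class ToyRevlog:
    def __init__(self):
        self.entries = []     # list of ("snap" or "delta", data)
        self.delta_bytes = 0  # delta bytes stored since the last snapshot

    def add(self, text):
        """Store a revision as a delta if possible, else as a snapshot."""
        if self.entries:
            prev = self.read(len(self.entries) - 1)
            if text.startswith(prev):
                delta = text[len(prev):]
                if self.delta_bytes + len(delta) <= THRESHOLD:
                    self.entries.append(("delta", delta))
                    self.delta_bytes += len(delta)
                    return
        # Threshold would be exceeded (or no usable delta): store a snapshot.
        self.entries.append(("snap", text))
        self.delta_bytes = 0

    def read(self, rev):
        """Walk back to the nearest snapshot, then apply deltas forward."""
        base = rev
        while self.entries[base][0] != "snap":
            base -= 1
        text = self.entries[base][1]
        for _kind, data in self.entries[base + 1:rev + 1]:
            text += data
        return text
```

Because a snapshot appears whenever the accumulated deltas grow too large, `read` never has to replay more than a bounded amount of delta data, no matter how long the history is.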
+ + Identification and strong integrity + + Along with delta or snapshot information, a revlog entry + contains a cryptographic hash of the data that it represents. + This makes it difficult to forge the contents of a revision, + and easy to detect accidental corruption. + + Hashes provide more than a mere check against corruption; + they are used as the identifiers for revisions. The changeset + identification hashes that you see as an end user are from + revisions of the changelog. Although filelogs and the + manifest also use hashes, Mercurial only uses these behind the + scenes. + + Mercurial verifies that hashes are correct when it + retrieves file revisions and when it pulls changes from + another repository. If it encounters an integrity problem, it + will complain and stop whatever it's doing. + + In addition to the effect it has on retrieval efficiency, + Mercurial's use of periodic snapshots makes it more robust + against partial data corruption. If a revlog becomes partly + corrupted due to a hardware error or system bug, it's often + possible to reconstruct some or most revisions from the + uncorrupted sections of the revlog, both before and after the + corrupted section. This would not be possible with a + delta-only storage model. + +
+ + + Revision history, branching, and merging + + Every entry in a Mercurial revlog knows the identity of its + immediate ancestor revision, usually referred to as its + parent. In fact, a revision contains room + for not one parent, but two. Mercurial uses a special hash, + called the null ID, to represent the idea + "there is no parent here". This hash is simply a + string of zeroes. + + In the figure below, you can see + an example of the conceptual structure of a revlog. Filelogs, + manifests, and changelogs all have this same structure; they + differ only in the kind of data stored in each delta or + snapshot. + + The first revision in a revlog (at the bottom of the image) + has the null ID in both of its parent slots. For a + normal revision, its first parent slot contains + the ID of its parent revision, and its second contains the null + ID, indicating that the revision has only one real parent. Any + two revisions that have the same parent ID are branches. A + revision that represents a merge between branches has two normal + revision IDs in its parent slots. + +
+ The conceptual structure of a revlog + + + XXX add text + +
+ +
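The hash-based identifiers from the previous section and the two parent slots described here fit together naturally, and a short Python sketch may help. The exact recipe used below (SHA-1 over the two parent IDs in sorted order, followed by the revision data) is an assumption for the purposes of this sketch; the text above only says that a cryptographic hash of the data is stored. The function names are hypothetical.

```python
import hashlib

# The null ID: "simply a string of zeroes" (here, twenty zero bytes).
NULL_ID = b"\0" * 20


def node_id(text, p1=NULL_ID, p2=NULL_ID):
    """Identify a revision by hashing its parents (sorted) and its data.

    Assumed recipe for illustration; not taken from the text above.
    """
    first, second = sorted([p1, p2])
    h = hashlib.sha1(first)
    h.update(second)
    h.update(text)
    return h.digest()


def verify(text, p1, p2, expected):
    """Return True if the stored data still matches its identifier."""
    return node_id(text, p1, p2) == expected
```

A root revision has the null ID in both slots, a normal revision has one real parent and one null ID, and a merge has two real parents. Corrupting even one byte of the data changes the hash, which is how a check like `verify` would notice the damage.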
+ + The working directory + + In the working directory, Mercurial stores a snapshot of the + files from the repository as of a particular changeset. + + The working directory knows which changeset + it contains. When you update the working directory to contain a + particular changeset, Mercurial looks up the appropriate + revision of the manifest to find out which files it was tracking + at the time that changeset was committed, and which revision of + each file was then current. It then recreates a copy of each of + those files, with the same contents it had when the changeset + was committed. + + The dirstate is a special + structure that contains Mercurial's knowledge of the working + directory. It is maintained as a file named + .hg/dirstate inside a repository. The + dirstate details which changeset the working directory is + updated to, and all of the files that Mercurial is tracking in + the working directory. It also lets Mercurial quickly notice + changed files, by recording their checkout times and + sizes. + + Just as a revision of a revlog has room for two parents, so + that it can represent either a normal revision (with one parent) + or a merge of two earlier revisions, the dirstate has slots for + two parents. When you use the hg + update command, the changeset that you update to is + stored in the first parent slot, and the null ID + in the second. When you hg + merge with another changeset, the first parent + remains unchanged, and the second parent is filled in with the + changeset you're merging with. The hg + parents command tells you what the parents of the + dirstate are. + + + What happens when you commit + + The dirstate stores parent information for more than just + book-keeping purposes. Mercurial uses the parents of the + dirstate as the parents of a new + changeset when you perform a commit. + +
+ The working directory can have two parents + + + XXX add text + +
+ + The figure above shows the + normal state of the working directory, where it has a single + changeset as parent. That changeset is the + tip, the newest changeset in the + repository that has no children. + +
+ The working directory gains new parents after a + commit + + + XXX add text + +
+ + It's useful to think of the working directory as + "the changeset I'm about to commit". Any files + that you tell Mercurial that you've added, removed, renamed, + or copied will be reflected in that changeset, as will + modifications to any files that Mercurial is already tracking; + the new changeset will have the parents of the working + directory as its parents. + + After a commit, Mercurial will update the + parents of the working directory, so that the first parent is + the ID of the new changeset, and the second is the null ID. + This is shown in the figure above. Mercurial + doesn't touch any of the files in the working directory when + you commit; it just modifies the dirstate to note its new + parents. + +
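The way the dirstate's two parent slots change across an update, a merge, and a commit can be modelled in a few lines of Python. The class `ToyDirstate` and its methods are hypothetical stand-ins for the behavior described in the text; real changeset IDs are hashes, but short placeholder strings are enough to show the bookkeeping.

```python
# Toy model of the dirstate's parent slots. Hypothetical names throughout.

NULL_ID = "0" * 40


class ToyDirstate:
    def __init__(self):
        self.parents = (NULL_ID, NULL_ID)

    def update(self, changeset):
        """hg update: first parent is the target, second is the null ID."""
        self.parents = (changeset, NULL_ID)

    def merge(self, other):
        """hg merge: first parent unchanged, second is the other changeset."""
        self.parents = (self.parents[0], other)

    def commit(self, new_id):
        """hg commit: the new changeset inherits the dirstate's parents."""
        recorded = {"id": new_id, "parents": self.parents}
        # Afterwards the working directory's only parent is the new changeset.
        self.parents = (new_id, NULL_ID)
        return recorded
```

Nothing in `commit` touches any file contents, which mirrors the point made above: committing only changes what the dirstate records as its parents.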
+ + Creating a new head + + It's perfectly normal to update the working directory to a + changeset other than the current tip. For example, you might + want to know what your project looked like last Tuesday, or + you could be looking through changesets to see which one + introduced a bug. In cases like this, the natural thing to do + is update the working directory to the changeset you're + interested in, and then examine the files in the working + directory directly to see their contents as they were when you + committed that changeset. The effect of this is shown in + the figure below. + +
+ The working directory, updated to an older + changeset + + + XXX add text + +
+ + Having updated the working directory to an + older changeset, what happens if you make some changes, and + then commit? Mercurial behaves in the same way as I outlined + above. The parents of the working directory become the + parents of the new changeset. This new changeset has no + children, so it becomes the new tip. And the repository now + contains two changesets that have no children; we call these + heads. You can see the structure that + this creates in the figure below. + +
+ After a commit made while synced to an older + changeset + + + XXX add text + +
+ + + If you're new to Mercurial, you should keep + in mind a common "error", which is to use the + hg pull command without any + options. By default, the hg + pull command does not + update the working directory, so you'll bring new changesets + into your repository, but the working directory will stay + synced at the same changeset as before the pull. If you + make some changes and commit afterwards, you'll thus create + a new head, because your working directory isn't synced to + whatever the current tip is. To combine the operation of a + pull, followed by an update, run hg pull + -u. + + I put the word "error" in quotes + because all that you need to do to rectify the situation + where you created a new head by accident is + hg merge, then hg commit. In other words, this + almost never has negative consequences; it's just something + of a surprise for newcomers. I'll discuss other ways to + avoid this behavior, and why Mercurial behaves in this + initially surprising way, later on. + + +
+ + Merging changes + + When you run the hg + merge command, Mercurial leaves the first parent + of the working directory unchanged, and sets the second parent + to the changeset you're merging with, as shown in the figure below. + +
+ Merging two heads + + + + + XXX add text + +
+ + Mercurial also has to modify the working directory, to + merge the files managed in the two changesets. Simplified a + little, the merging process goes like this, for every file in + the manifests of both changesets. + + If neither changeset has modified a file, do + nothing with that file. + + If one changeset has modified a file, and the + other hasn't, create the modified copy of the file in the + working directory. + + If one changeset has removed a file, and the + other hasn't (or has also deleted it), delete the file + from the working directory. + + If one changeset has removed a file, but the + other has modified the file, ask the user what to do: keep + the modified file, or remove it? + + If both changesets have modified a file, + invoke an external merge program to choose the new + contents for the merged file. This may require input from + the user. + + If one changeset has modified a file, and the + other has renamed or copied the file, make sure that the + changes follow the new name of the file. + + There are more details&emdash;merging has plenty of corner + cases&emdash;but these are the most common choices that are + involved in a merge. As you can see, most cases are + completely automatic, and indeed most merges finish + automatically, without requiring your input to resolve any + conflicts. + + When you're thinking about what happens when you commit + after a merge, once again the working directory is the + changeset I'm about to commit. After the hg merge command completes, the + working directory has two parents; these will become the + parents of the new changeset. + + Mercurial lets you perform multiple merges, but + you must commit the results of each individual merge as you + go. This is necessary because Mercurial only tracks two + parents for both revisions and the working directory. While + it would be technically feasible to merge multiple changesets + at once, Mercurial avoids this for simplicity. 
With multi-way + merges, the risks of user confusion, nasty conflict + resolution, and making a terrible mess of a merge would grow + intolerable. + +
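The per-file rules listed earlier in this section can be written down as a small decision function. This is a simplified sketch, not Mercurial's merge code: each side of the merge is reduced to one of three states relative to the common ancestor, the rename and copy case is left out, and the function name and return strings are invented for illustration.

```python
# Simplified per-file merge decision, following the rules listed above.
# Each side is "unchanged", "modified", or "removed" relative to the
# common ancestor. Renames and copies are deliberately not modelled.

VALID_STATES = {"unchanged", "modified", "removed"}


def merge_action(ours, theirs):
    """Return what to do with one file, given what each side did to it."""
    if ours not in VALID_STATES or theirs not in VALID_STATES:
        raise ValueError("unknown state")
    states = {ours, theirs}
    if states == {"unchanged"}:
        return "keep"                 # neither side touched the file
    if "removed" in states:
        if "modified" in states:
            return "ask user"         # removed on one side, modified on other
        return "remove"               # removed on one or both sides
    if states == {"modified"}:
        return "run merge tool"       # both sides modified the file
    return "take modified"            # exactly one side modified the file
```

Only two of the outcomes, `"ask user"` and possibly `"run merge tool"`, need a person's attention, which matches the observation above that most merges finish automatically.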
+ + + Merging and renames + + A surprising number of revision control systems pay little + or no attention to a file's name over + time. For instance, it used to be common that if a file got + renamed on one side of a merge, the changes from the other + side would be silently dropped. + + Mercurial records metadata when you tell it to perform a + rename or copy. It uses this metadata during a merge to do the + right thing. For instance, if I rename + a file, and you edit it without renaming it, when we merge our + work the file will be renamed and have your edits + applied. + +
+ + + Other interesting design features + + In the sections above, I've tried to highlight some of the + most important aspects of Mercurial's design, to illustrate that + it pays careful attention to reliability and performance. + However, the attention to detail doesn't stop there. There are + a number of other aspects of Mercurial's construction that I + personally find interesting. I'll detail a few of them here, + separate from the big ticket items above, so that + if you're interested, you can gain a better idea of the amount + of thinking that goes into a well-designed system. + + + Clever compression + + When appropriate, Mercurial will store both snapshots and + deltas in compressed form. It does this by always + trying to compress a snapshot or delta, + but only storing the compressed version if it's smaller than + the uncompressed version. + + This means that Mercurial does the right + thing when storing a file whose native form is + compressed, such as a zip archive or a JPEG + image. When these types of files are compressed a second + time, the resulting file is usually bigger than the + once-compressed form, and so Mercurial will store the plain + zip or JPEG. + + Deltas between revisions of a compressed file are usually + larger than snapshots of the file, and Mercurial again does + the right thing in these cases. It finds that + such a delta exceeds the threshold at which it should store a + complete snapshot of the file, so it stores the snapshot, + again saving space compared to a naive delta-only + approach. + + + Network recompression + + When storing revisions on disk, Mercurial uses the + deflate compression algorithm (the same one + used by the popular zip archive format), + which balances good speed with a respectable compression + ratio. However, when transmitting revision data over a + network connection, Mercurial uncompresses the compressed + revision data. 
+ + If the connection is over HTTP, Mercurial recompresses + the entire stream of data using a compression algorithm that + gives a better compression ratio (the Burrows-Wheeler + algorithm from the widely used bzip2 + compression package). This combination of algorithm and + compression of the entire stream (instead of a revision at a + time) substantially reduces the number of bytes to be + transferred, yielding better network performance over most + kinds of network. + + If the connection is over + ssh, Mercurial + doesn't recompress the stream, because + ssh can already do this itself. You can + tell Mercurial to always use ssh's + compression feature by editing the + .hgrc file in your home directory as + follows. + + [ui] +ssh = ssh -C + + + + + Read/write ordering and atomicity + + Appending to files isn't the whole story when + it comes to guaranteeing that a reader won't see a partial + write. If you recall , + revisions in the changelog point to revisions in the manifest, + and revisions in the manifest point to revisions in filelogs. + This hierarchy is deliberate. + + A writer starts a transaction by writing filelog and + manifest data, and doesn't write any changelog data until + those are finished. A reader starts by reading changelog + data, then manifest data, followed by filelog data. + + Since the writer has always finished writing filelog and + manifest data before it writes to the changelog, a reader will + never read a pointer to a partially written manifest revision + from the changelog, and it will never read a pointer to a + partially written filelog revision from the manifest. + + + + Concurrent access + + The read/write ordering and atomicity guarantees mean that + Mercurial never needs to lock a + repository when it's reading data, even if the repository is + being written to while the read is occurring. 
This has a big + effect on scalability; you can have an arbitrary number of + Mercurial processes safely reading data from a repository + all at once, no matter whether it's being written to or + not. + + The lockless nature of reading means that if you're + sharing a repository on a multi-user system, you don't need to + grant other local users permission to + write to your repository in order for + them to be able to clone it or pull changes from it; they only + need read permission. (This is + not a common feature among revision + control systems, so don't take it for granted! Most require + readers to be able to lock a repository to access it safely, + and this requires write permission on at least one directory, + which of course makes for all kinds of nasty and annoying + security and administrative problems.) + + Mercurial uses locks to ensure that only one process can + write to a repository at a time (the locking mechanism is safe + even over filesystems that are notoriously hostile to locking, + such as NFS). If a repository is locked, a writer will wait + for a while to retry if the repository becomes unlocked, but + if the repository remains locked for too long, the process + attempting to write will time out after a while. This means + that your daily automated scripts won't get stuck forever and + pile up if a system crashes unnoticed, for example. (Yes, the + timeout is configurable, from zero to infinity.) + + + Safe dirstate access + + As with revision data, Mercurial doesn't take a lock to + read the dirstate file; it does acquire a lock to write it. + To avoid the possibility of reading a partially written copy + of the dirstate file, Mercurial writes to a file with a + unique name in the same directory as the dirstate file, then + renames the temporary file atomically to + dirstate. The file named + dirstate is thus guaranteed to be + complete, not partially written. 
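The write-then-rename technique described for the dirstate is a general one, and it can be sketched in Python as follows. The helper name `write_atomically` is hypothetical, and this is not Mercurial's own code; it assumes the temporary file and the target live on the same filesystem, which is what makes the final rename atomic on POSIX systems.

```python
import os
import tempfile


def write_atomically(path, data):
    """Write data to a uniquely named temporary file, then rename it.

    Readers that open `path` see either the old complete file or the new
    complete file, never a partially written one.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the same directory as the target.
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        os.replace(tmp, path)  # atomic replacement of the target name
    except BaseException:
        # Clean up the temporary file if anything went wrong.
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
```

A reader needs no lock with this scheme, because the name `path` always refers to a complete file; only writers have to coordinate with each other.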
+ + + + + Avoiding seeks + + Critical to Mercurial's performance is the avoidance of + seeks of the disk head, since any seek is far more expensive + than even a comparatively large read operation. + + This is why, for example, the dirstate is stored in a + single file. If there were a dirstate file per directory that + Mercurial tracked, the disk would seek once per directory. + Instead, Mercurial reads the entire single dirstate file in + one step. + + Mercurial also uses a copy on write scheme + when cloning a repository on local storage. Instead of + copying every revlog file from the old repository into the new + repository, it makes a hard link, which is a + shorthand way to say these two names point to the same + file. When Mercurial is about to write to one of a + revlog's files, it checks to see if the number of names + pointing at the file is greater than one. If it is, more than + one repository is using the file, so Mercurial makes a new + copy of the file that is private to this repository. + + A few revision control developers have pointed out that + this idea of making a complete private copy of a file is not + very efficient in its use of storage. While this is true, + storage is cheap, and this method gives the highest + performance while deferring most book-keeping to the operating + system. An alternative scheme would most likely reduce + performance and increase the complexity of the software, but + speed and simplicity are key to the feel of + day-to-day use. + + + + Other contents of the dirstate + + Because Mercurial doesn't force you to tell it when you're + modifying a file, it uses the dirstate to store some extra + information so it can determine efficiently whether you have + modified a file. For each file in the working directory, it + stores the time that it last modified the file itself, and the + size of the file at that time. 
+ + When you explicitly hg + add, hg remove, + hg rename or hg copy files, Mercurial updates the + dirstate so that it knows what to do with those files when you + commit. + + The dirstate helps Mercurial to efficiently + check the status of files in a repository. + + + + When Mercurial checks the state of a file in the + working directory, it first checks a file's modification + time against the time in the dirstate that records when + Mercurial last wrote the file. If the last modified time + is the same as the time when Mercurial wrote the file, the + file must not have been modified, so Mercurial does not + need to check any further. + + + If the file's size has changed, the file must have + been modified. If the modification time has changed, but + the size has not, only then does Mercurial need to + actually read the contents of the file to see if it has + changed. + + + + Storing the modification time and size dramatically + reduces the number of read operations that Mercurial needs to + perform when we run commands like hg status. + This results in large performance improvements. + + +
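The order of checks described above (modification time first, then size, then contents only as a last resort) can be sketched as a single Python function. The function name and its parameters are hypothetical; in particular, `recorded_data` stands in for whatever Mercurial would compare against from the repository, which this sketch does not model.

```python
import os


def is_modified(path, recorded_mtime, recorded_size, recorded_data):
    """Decide whether a file changed, reading its contents only if needed."""
    st = os.stat(path)
    if st.st_mtime == recorded_mtime:
        return False  # same modification time: treat as unchanged
    if st.st_size != recorded_size:
        return True   # size differs: must have been modified
    # Time changed but size did not: only now read and compare contents.
    with open(path, "rb") as f:
        return f.read() != recorded_data
```

In the common case where nothing has changed, the function returns after a single `os.stat` call, which is why a command like hg status can stay fast even in a large working directory.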
+ + diff -r 5276f40fca1c -r 896ab6eaf1c6 en/ch04-daily.xml --- a/en/ch04-daily.xml Thu Jul 09 13:32:44 2009 +0900 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,682 +0,0 @@ - - - - - Mercurial in daily use - - - Telling Mercurial which files to track - - Mercurial does not work with files in your repository unless - you tell it to manage them. The hg - status command will tell you which files Mercurial - doesn't know about; it uses a - ? to display such - files. - - To tell Mercurial to track a file, use the hg add command. Once you have added a - file, the entry in the output of hg - status for that file changes from - ? to - A. - - &interaction.daily.files.add; - - After you run a hg commit, - the files that you added before the commit will no longer be - listed in the output of hg - status. The reason for this is that by default, hg status only tells you about - interesting files&emdash;those that you have (for - example) modified, removed, or renamed. If you have a repository - that contains thousands of files, you will rarely want to know - about files that Mercurial is tracking, but that have not - changed. (You can still get this information; we'll return to - this later.) - - Once you add a file, Mercurial doesn't do anything with it - immediately. Instead, it will take a snapshot of the file's - state the next time you perform a commit. It will then continue - to track the changes you make to the file every time you commit, - until you remove the file. - - - Explicit versus implicit file naming - - A useful behavior that Mercurial has is that if you pass - the name of a directory to a command, every Mercurial command - will treat this as I want to operate on every file in - this directory and its subdirectories. - - &interaction.daily.files.add-dir; - - Notice in this example that Mercurial printed - the names of the files it added, whereas it didn't do so when - we added the file named myfile.txt in the - earlier example. 
- - What's going on is that in the former case, we explicitly - named the file to add on the command line. The assumption - that Mercurial makes in such cases is that we know what we - are doing, and it doesn't print any output. - - However, when we imply the names of - files by giving the name of a directory, Mercurial takes the - extra step of printing the name of each file that it does - something with. This makes it more clear what is happening, - and reduces the likelihood of a silent and nasty surprise. - This behavior is common to most Mercurial commands. - - - - Mercurial tracks files, not directories - - Mercurial does not track directory information. Instead, - it tracks the path to a file. Before creating a file, it - first creates any missing directory components of the path. - After it deletes a file, it then deletes any empty directories - that were in the deleted file's path. This sounds like a - trivial distinction, but it has one minor practical - consequence: it is not possible to represent a completely - empty directory in Mercurial. - - Empty directories are rarely useful, and there are - unintrusive workarounds that you can use to achieve an - appropriate effect. The developers of Mercurial thus felt - that the complexity that would be required to manage empty - directories was not worth the limited benefit this feature - would bring. - - If you need an empty directory in your repository, there - are a few ways to achieve this. One is to create a directory, - then hg add a - hidden file to that directory. On Unix-like - systems, any file name that begins with a period - (.) is treated as hidden by - most commands and GUI tools. This approach is illustrated - below. - -&interaction.daily.files.hidden; - - Another way to tackle a need for an empty directory is to - simply create one in your automated build scripts before they - will need it. 
- - - - - How to stop tracking a file - - Once you decide that a file no longer belongs in your - repository, use the hg remove - command. This deletes the file, and tells Mercurial to stop - tracking it. A removed file is represented in the output of - hg status with a - R. - - &interaction.daily.files.remove; - - After you hg remove a file, - Mercurial will no longer track changes to that file, even if you - recreate a file with the same name in your working directory. - If you do recreate a file with the same name and want Mercurial - to track the new file, simply hg - add it. Mercurial will know that the newly added - file is not related to the old file of the same name. - - - Removing a file does not affect its history - - It is important to understand that removing a file has - only two effects. - - It removes the current version of the file - from the working directory. - - It stops Mercurial from tracking changes to - the file, from the time of the next commit. - - Removing a file does not in any way - alter the history of the file. - - If you update the working directory to a - changeset that was committed when it was still tracking a file - that you later removed, the file will reappear in the working - directory, with the contents it had when you committed that - changeset. If you then update the working directory to a - later changeset, in which the file had been removed, Mercurial - will once again remove the file from the working - directory. - - - - Missing files - - Mercurial considers a file that you have deleted, but not - used hg remove to delete, to - be missing. A missing file is - represented with ! in the - output of hg status. - Mercurial commands will not generally do anything with missing - files. 
- - &interaction.daily.files.missing; - - If your repository contains a file that hg status reports as missing, and - you want the file to stay gone, you can run hg remove at any - time later on, to tell Mercurial that you really did mean to - remove the file. - - &interaction.daily.files.remove-after; - - On the other hand, if you deleted the missing file by - accident, give hg revert the - name of the file to recover. It will reappear, in unmodified - form. - - &interaction.daily.files.recover-missing; - - - - Aside: why tell Mercurial explicitly to remove a - file? - - You might wonder why Mercurial requires you to explicitly - tell it that you are deleting a file. Early during the - development of Mercurial, it let you delete a file however you - pleased; Mercurial would notice the absence of the file - automatically when you next ran a hg - commit, and stop tracking the file. In practice, - this made it too easy to accidentally remove a file without - noticing. - - - - Useful shorthand&emdash;adding and removing files in one - step - - Mercurial offers a combination command, hg addremove, that adds untracked - files and marks missing files as removed. - - &interaction.daily.files.addremove; - - The hg commit command - also provides a - option that performs this same add-and-remove, immediately - followed by a commit. - - &interaction.daily.files.commit-addremove; - - - - - Copying files - - Mercurial provides a hg - copy command that lets you make a new copy of a - file. When you copy a file using this command, Mercurial makes - a record of the fact that the new file is a copy of the original - file. It treats these copied files specially when you merge - your work with someone else's. - - - The results of copying during a merge - - What happens during a merge is that changes - follow a copy. To best illustrate what this - means, let's create an example. We'll start with the usual - tiny repository that contains a single file. 
- - &interaction.daily.copy.init; - - We need to do some work in - parallel, so that we'll have something to merge. So let's - clone our repository. - - &interaction.daily.copy.clone; - - Back in our initial repository, let's use the hg copy command to make a copy of - the first file we created. - - &interaction.daily.copy.copy; - - If we look at the output of the hg - status command afterwards, the copied file looks - just like a normal added file. - - &interaction.daily.copy.status; - - But if we pass the option to hg status, it prints another line of - output: this is the file that our newly-added file was copied - from. - - &interaction.daily.copy.status-copy; - - Now, back in the repository we cloned, let's make a change - in parallel. We'll add a line of content to the original file - that we created. - - &interaction.daily.copy.other; - - Now we have a modified file in this - repository. When we pull the changes from the first - repository, and merge the two heads, Mercurial will propagate - the changes that we made locally to file - into its copy, new-file. - - &interaction.daily.copy.merge; - - - - Why should changes follow copies? - - This behavior&emdash;of changes to a file - propagating out to copies of the file&emdash;might seem - esoteric, but in most cases it's highly desirable. - - First of all, remember that this propagation - only happens when you merge. So if you - hg copy a file, and - subsequently modify the original file during the normal course - of your work, nothing will happen. - - The second thing to know is that modifications will only - propagate across a copy as long as the changeset that you're - merging changes from hasn't yet seen - the copy. - - The reason that Mercurial does this is as follows. Let's - say I make an important bug fix in a source file, and commit - my changes. 
Meanwhile, you've decided to hg copy the file in your repository, - without knowing about the bug or having seen the fix, and you - have started hacking on your copy of the file. - - If you pulled and merged my changes, and Mercurial - didn't propagate changes across copies, - your new source file would now contain the bug, and unless you - knew to propagate the bug fix by hand, the bug would - remain in your copy of the file. - - By automatically propagating the change that fixed the bug - from the original file to the copy, Mercurial prevents this - class of problem. To my knowledge, Mercurial is the - only revision control system that - propagates changes across copies like this. - - Once your change history has a record that the copy and - subsequent merge occurred, there's usually no further need to - propagate changes from the original file to the copied file, - and that's why Mercurial only propagates changes across copies - at the first merge, and not afterwards. - - - - How to make changes <emphasis>not</emphasis> follow a - copy - - If, for some reason, you decide that this business of - automatically propagating changes across copies is not for - you, simply use your system's normal file copy command (on - Unix-like systems, that's cp) to make a - copy of a file, then hg add - the new copy by hand. Before you do so, though, please do - reread , and make - an informed - decision that this behavior is not appropriate to your - specific case. - - - - Behavior of the <command role="hg-cmd">hg copy</command> - command - - When you use the hg copy - command, Mercurial makes a copy of each source file as it - currently stands in the working directory. This means that if - you make some modifications to a file, then hg copy it without first having - committed those changes, the new copy will also contain the - modifications you have made up until that point. (I find this - behavior a little counterintuitive, which is why I mention it - here.) 
- - The hg copy - command acts similarly to the Unix cp - command (you can use the hg - cp alias if you prefer). We must supply two or - more arguments, of which the last is treated as the - destination, and all others are - sources. - - If you pass hg copy a - single file as the source, and the destination does not exist, - it creates a new file with that name. - - &interaction.daily.copy.simple; - - If the destination is a directory, Mercurial copies its - sources into that directory. - - &interaction.daily.copy.dir-dest; - - Copying a directory is - recursive, and preserves the directory structure of the - source. - - &interaction.daily.copy.dir-src; - - If the source and destination are both directories, the - source tree is recreated in the destination directory. - - &interaction.daily.copy.dir-src-dest; - - As with the hg remove - command, if you copy a file manually and then want Mercurial - to know that you've copied the file, simply use the option to hg copy. - - &interaction.daily.copy.after; - - - - - Renaming files - - It's rather more common to need to rename a file than to - make a copy of it. The reason I discussed the hg copy command before talking about - renaming files is that Mercurial treats a rename in essentially - the same way as a copy. Therefore, knowing what Mercurial does - when you copy a file tells you what to expect when you rename a - file. - - When you use the hg rename - command, Mercurial makes a copy of each source file, then - deletes it and marks the file as removed. - - &interaction.daily.rename.rename; - - The hg status command shows - the newly copied file as added, and the copied-from file as - removed. - - &interaction.daily.rename.status; - - As with the results of a hg - copy, we must use the option to hg status to see that the added file - is really being tracked by Mercurial as a copy of the original, - now removed, file. 
- - &interaction.daily.rename.status-copy; - - As with hg remove and - hg copy, you can tell Mercurial - about a rename after the fact using the option. In most other - respects, the behavior of the hg - rename command, and the options it accepts, are - similar to the hg copy - command. - - If you're familiar with the Unix command line, you'll be - glad to know that hg rename - command can be invoked as hg - mv. - - - Renaming files and merging changes - - Since Mercurial's rename is implemented as - copy-and-remove, the same propagation of changes happens when - you merge after a rename as after a copy. - - If I modify a file, and you rename it to a new name, and - then we merge our respective changes, my modifications to the - file under its original name will be propagated into the file - under its new name. (This is something you might expect to - simply work, but not all revision control - systems actually do this.) - - Whereas having changes follow a copy is a feature where - you can perhaps nod and say yes, that might be - useful, it should be clear that having them follow a - rename is definitely important. Without this facility, it - would simply be too easy for changes to become orphaned when - files are renamed. - - - - Divergent renames and merging - - The case of diverging names occurs when two developers - start with a file&emdash;let's call it - foo&emdash;in their respective - repositories. - - &interaction.rename.divergent.clone; - - Anne renames the file to bar. - - &interaction.rename.divergent.rename.anne; - - Meanwhile, Bob renames it to - quux. (Remember that hg mv is an alias for hg rename.) - - &interaction.rename.divergent.rename.bob; - - I like to think of this as a conflict because each - developer has expressed different intentions about what the - file ought to be named. - - What do you think should happen when they merge their - work? 
Mercurial's actual behavior is that it always preserves - both names when it merges changesets that - contain divergent renames. - - &interaction.rename.divergent.merge; - - Notice that while Mercurial warns about the divergent - renames, it leaves it up to you to do something about the - divergence after the merge. - - - - Convergent renames and merging - - Another kind of rename conflict occurs when two people - choose to rename different source files - to the same destination. In this case, - Mercurial runs its normal merge machinery, and lets you guide - it to a suitable resolution. - - - - Other name-related corner cases - - Mercurial has a longstanding bug in which it fails to - handle a merge where one side has a file with a given name, - while another has a directory with the same name. This is - documented as issue - 29. - - &interaction.issue29.go; - - - - - - Recovering from mistakes - - Mercurial has some useful commands that will help you to - recover from some common mistakes. - - The hg revert command lets - you undo changes that you have made to your working directory. - For example, if you hg add a - file by accident, just run hg - revert with the name of the file you added, and - while the file won't be touched in any way, it won't be tracked - for adding by Mercurial any longer, either. You can also use - hg revert to get rid of - erroneous changes to a file. - - It's good to remember that the hg - revert command is useful for changes that you have - not yet committed. Once you've committed a change, if you - decide it was a mistake, you can still do something about it, - though your options may be more limited. - - For more information about the hg revert command, and details about - how to deal with changes you have already committed, see . - - - - Dealing with tricky merges - - In a complicated or large project, it's not unusual for a - merge of two changesets to result in some headaches. 
Suppose - there's a big source file that's been extensively edited by each - side of a merge: this is almost inevitably going to result in - conflicts, some of which can take a few tries to sort - out. - - Let's develop a simple case of this and see how to deal with - it. We'll start off with a repository containing one file, and - clone it twice. - - &interaction.ch04-resolve.init; - - In one clone, we'll modify the file in one way. - - &interaction.ch04-resolve.left; - - In another, we'll modify the file differently. - - &interaction.ch04-resolve.right; - - Next, we'll pull each set of changes into our original - repo. - - &interaction.ch04-resolve.pull; - - We expect our repository to now contain two heads. - - &interaction.ch04-resolve.heads; - - Normally, if we run hg - merge at this point, it will drop us into a GUI that - will let us manually resolve the conflicting edits to - myfile.txt. However, to simplify things - for presentation here, we'd like the merge to fail immediately - instead. Here's one way we can do so. - - &interaction.ch04-resolve.export; - - We've told Mercurial's merge machinery to run the command - false (which, as we desire, fails - immediately) if it detects a merge that it can't sort out - automatically. - - If we now fire up hg - merge, it should grind to a halt and report a - failure. - - &interaction.ch04-resolve.merge; - - Even if we don't notice that the merge failed, Mercurial - will prevent us from accidentally committing the result of a - failed merge. - - &interaction.ch04-resolve.cifail; - - When hg commit fails in - this case, it suggests that we use the unfamiliar hg resolve command. As usual, - hg help resolve will print a - helpful synopsis. - - - File resolution states - - When a merge occurs, most files will usually remain - unmodified. For each file where Mercurial has to do - something, it tracks the state of the file. 
- - - - A resolved file has been - successfully merged, either automatically by Mercurial or - manually with human intervention. - - - An unresolved file was not merged - successfully, and needs more attention. - - - - If Mercurial sees any file in the - unresolved state after a merge, it considers the merge to have - failed. Fortunately, we do not need to restart the entire - merge from scratch. - - The or - option to hg resolve prints out the state of - each merged file. - - &interaction.ch04-resolve.list; - - In the output from hg - resolve, a resolved file is marked with - R, while an unresolved file is marked with - U. If any files are listed with - U, we know that an attempt to commit the - results of the merge will fail. - - - - Resolving a file merge - - We have several options to move a file from the unresolved - into the resolved state. By far the most common is to rerun - hg resolve. If we pass the - names of individual files or directories, it will retry the - merges of any unresolved files present in those locations. We - can also pass the - or option, which - will retry the merges of all unresolved - files. - - Mercurial also lets us modify the resolution state of a - file directly. We can manually mark a file as resolved using - the option, or - as unresolved using the option. This allows - us to clean up a particularly messy merge by hand, and to keep - track of our progress with each file as we go. - - - - - diff -r 5276f40fca1c -r 896ab6eaf1c6 en/ch05-collab.xml --- a/en/ch05-collab.xml Thu Jul 09 13:32:44 2009 +0900 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,1575 +0,0 @@ - - - - - Collaborating with other people - - As a completely decentralised tool, Mercurial doesn't impose - any policy on how people ought to work with each other. However, - if you're new to distributed revision control, it helps to have - some tools and examples in mind when you're thinking about - possible workflow models. 
- - - Mercurial's web interface - - Mercurial has a powerful web interface that provides several - useful capabilities. - - For interactive use, the web interface lets you browse a - single repository or a collection of repositories. You can view - the history of a repository, examine each change (comments and - diffs), and view the contents of each directory and file. You - can even get a view of history that gives a graphical view of - the relationships between individual changes and merges. - - Also for human consumption, the web interface provides - Atom and RSS feeds of the changes in a repository. This lets you - subscribe to a repository using your favorite - feed reader, and be automatically notified of activity in that - repository as soon as it happens. I find this capability much - more convenient than the model of subscribing to a mailing list - to which notifications are sent, as it requires no additional - configuration on the part of whoever is serving the - repository. - - The web interface also lets remote users clone a repository, - pull changes from it, and (when the server is configured to - permit it) push changes back to it. Mercurial's HTTP tunneling - protocol aggressively compresses data, so that it works - efficiently even over low-bandwidth network connections. - - The easiest way to get started with the web interface is to - use your web browser to visit an existing repository, such as - the master Mercurial repository at http://www.selenic.com/repo/hg. - - If you're interested in providing a web interface - to your own repositories, there are several good ways to do - this. - - The easiest and fastest way to get started in an informal - environment is to use the hg - serve command, which is best suited to short-term - lightweight serving. See below for details of how to use - this command. - - For longer-lived repositories that you'd like to have - permanently available, there are several public hosting services - available. 
- - - - Bitbucket, at http://bitbucket.org/, - provides free hosting for open source projects, and paid - hosting for commercial projects. - - - - If you would prefer to host your own repositories, Mercurial - has built-in support for several popular hosting technologies, - most notably CGI (Common Gateway Interface), and WSGI (Web - Server Gateway Interface). See for details of CGI and WSGI - configuration. - - - - Collaboration models - - With a suitably flexible tool, making decisions about - workflow is much more of a social engineering challenge than a - technical one. Mercurial imposes few limitations on how you can - structure the flow of work in a project, so it's up to you and - your group to set up and live with a model that matches your own - particular needs. - - - Factors to keep in mind - - The most important aspect of any model that you must keep - in mind is how well it matches the needs and capabilities of - the people who will be using it. This might seem - self-evident; even so, you still can't afford to forget it for - a moment. - - I once put together a workflow model that seemed to make - perfect sense to me, but that caused a considerable amount of - consternation and strife within my development team. In spite - of my attempts to explain why we needed a complex set of - branches, and how changes ought to flow between them, a few - team members revolted. Even though they were smart people, - they didn't want to pay attention to the constraints we were - operating under, or face the consequences of those constraints - in the details of the model that I was advocating. - - Don't sweep foreseeable social or technical problems under - the rug. Whatever scheme you put into effect, you should plan - for mistakes and problem scenarios. Consider adding automated - machinery to prevent, or quickly recover from, trouble that - you can anticipate.
As an example, if you intend to have a - branch with not-for-release changes in it, you'd do well to - think early about the possibility that someone might - accidentally merge those changes into a release branch. You - could avoid this particular problem by writing a hook that - prevents changes from being merged from an inappropriate - branch. - - - - Informal anarchy - - I wouldn't suggest an anything goes - approach as something sustainable, but it's a model that's - easy to grasp, and it works perfectly well in a few unusual - situations. - - As one example, many projects have a loose-knit group of - collaborators who rarely physically meet each other. Some - groups like to overcome the isolation of working at a distance - by organizing occasional sprints. In a sprint, - a number of people get together in a single location (a - company's conference room, a hotel meeting room, that kind of - place) and spend several days more or less locked in there, - hacking intensely on a handful of projects. - - A sprint or a hacking session in a coffee shop is the perfect place to use the - hg serve command, since - hg serve does not require any - fancy server infrastructure. You can get started with - hg serve in moments, by - reading below. Then simply - tell the person next to you that you're running a server, send - the URL to them in an instant message, and you immediately - have a quick-turnaround way to work together. They can type - your URL into their web browser and quickly review your - changes; or they can pull a bugfix from you and verify it; or - they can clone a branch containing a new feature and try it - out. - - The charm, and the problem, with doing things - in an ad hoc fashion like this is that only people who know - about your changes, and where they are, can see them. Such an - informal approach simply doesn't scale beyond a handful of - people, because each individual needs to know about - n different repositories to pull - from.
- - - - A single central repository - - For smaller projects migrating from a centralised revision - control tool, perhaps the easiest way to get started is to - have changes flow through a single shared central repository. - This is also the most common building block for - more ambitious workflow schemes. - - Contributors start by cloning a copy of this repository. - They can pull changes from it whenever they need to, and some - (perhaps all) developers have permission to push a change back - when they're ready for other people to see it. - - Under this model, it can still often make sense for people - to pull changes directly from each other, without going - through the central repository. Consider a case in which I - have a tentative bug fix, but I am worried that if I were to - publish it to the central repository, it might subsequently - break everyone else's trees as they pull it. To reduce the - potential for damage, I can ask you to clone my repository - into a temporary repository of your own and test it. This - lets us put off publishing the potentially unsafe change until - it has had a little testing. - - If a team is hosting its own repository in this - kind of scenario, people will usually use the - ssh protocol to securely push changes to - the central repository, as documented in . It's also usual to publish a - read-only copy of the repository over HTTP, as in - . Publishing over HTTP - satisfies the needs of people who don't have push access, and - those who want to use web browsers to browse the repository's - history. - - - - A hosted central repository - - A wonderful thing about public hosting services like - Bitbucket is that - not only do they handle the fiddly server configuration - details, such as user accounts, authentication, and secure - wire protocols, they provide additional infrastructure to make - this model work well. 
- - For instance, a well-engineered hosting service will let - people clone their own copies of a repository with a single - click. This lets people work in separate spaces and share - their changes when they're ready. - - In addition, a good hosting service will let people - communicate with each other, for instance to say there - are changes ready for you to review in this - tree. - - - - Working with multiple branches - - Projects of any significant size naturally tend to make - progress on several fronts simultaneously. In the case of - software, it's common for a project to go through periodic - official releases. A release might then go into - maintenance mode for a while after its first - publication; maintenance releases tend to contain only bug - fixes, not new features. In parallel with these maintenance - releases, one or more future releases may be under - development. People normally use the word - branch to refer to one of these many slightly - different directions in which development is - proceeding. - - Mercurial is particularly well suited to managing a number - of simultaneous, but not identical, branches. Each - development direction can live in its own - central repository, and you can merge changes from one to - another as the need arises. Because repositories are - independent of each other, unstable changes in a development - branch will never affect a stable branch unless someone - explicitly merges those changes into the stable branch. - - Here's an example of how this can work in practice. Let's - say you have one main branch on a central - server. - - &interaction.branching.init; - - People clone it, make changes locally, test them, and push - them back. - - Once the main branch reaches a release milestone, you can - use the hg tag command to - give a permanent name to the milestone revision. - - &interaction.branching.tag; - - Let's say some ongoing - development occurs on the main branch. 
- - &interaction.branching.main; - - Using the tag that was recorded at the milestone, people - who clone that repository at any time in the future can use - hg update to get a copy of - the working directory exactly as it was when that tagged - revision was committed. - - &interaction.branching.update; - - In addition, immediately after the main branch is tagged, - we can then clone the main branch on the server to a new - stable branch, also on the server. - - &interaction.branching.clone; - - If we need to make a change to the stable - branch, we can then clone that - repository, make our changes, commit, and push our changes - back there. - - &interaction.branching.stable; - - Because Mercurial repositories are independent, and - Mercurial doesn't move changes around automatically, the - stable and main branches are isolated - from each other. The changes that we made on the main branch - don't leak to the stable branch, and vice - versa. - - We'll often want all of our bugfixes on the stable - branch to show up on the main branch, too. Rather than - rewrite a bugfix on the main branch, we can simply pull and - merge changes from the stable to the main branch, and - Mercurial will bring those bugfixes in for us. - - &interaction.branching.merge; - - The main branch will still contain changes that - are not on the stable branch, but it will also contain all of - the bugfixes from the stable branch. The stable branch - remains unaffected by these changes, since changes are only - flowing from the stable to the main branch, and not the other - way. - - - - Feature branches - - For larger projects, an effective way to manage change is - to break up a team into smaller groups. Each group has a - shared branch of its own, cloned from a single - master branch used by the entire project. - People working on an individual branch are typically quite - isolated from developments on other branches. - -
- Feature branches - - - XXX add text - -
- - When a particular feature is deemed to be in suitable - shape, someone on that feature team pulls and merges from the - master branch into the feature branch, then pushes back up to - the master branch. -
- - - The release train - - Some projects are organized on a train - basis: a release is scheduled to happen every few months, and - whatever features are ready when the train is - ready to leave are allowed in. - - This model resembles working with feature branches. The - difference is that when a feature branch misses a train, - someone on the feature team pulls and merges the changes that - went out on that train release into the feature branch, and - the team continues its work on top of that release so that - their feature can make the next release. - - - - The Linux kernel model - - The development of the Linux kernel has a shallow - hierarchical structure, surrounded by a cloud of apparent - chaos. Because most Linux developers use - git, a distributed revision control tool - with capabilities similar to Mercurial, it's useful to - describe the way work flows in that environment; if you like - the ideas, the approach translates well across tools. - - At the center of the community sits Linus Torvalds, the - creator of Linux. He publishes a single source repository - that is considered the authoritative current - tree by the entire developer community. Anyone can clone - Linus's tree, but he is very choosy about whose trees he pulls - from. - - Linus has a number of trusted lieutenants. - As a general rule, he pulls whatever changes they publish, in - most cases without even reviewing those changes. Some of - those lieutenants are generally agreed to be - maintainers, responsible for specific - subsystems within the kernel. If a random kernel hacker wants - to make a change to a subsystem that they want to end up in - Linus's tree, they must find out who the subsystem's - maintainer is, and ask that maintainer to take their change. - If the maintainer reviews their changes and agrees to take - them, they'll pass them along to Linus in due course. 
- - Individual lieutenants have their own approaches to - reviewing, accepting, and publishing changes; and for deciding - when to feed them to Linus. In addition, there are several - well known branches that people use for different purposes. - For example, a few people maintain stable - repositories of older versions of the kernel, to which they - apply critical fixes as needed. Some maintainers publish - multiple trees: one for experimental changes; one for changes - that they are about to feed upstream; and so on. Others just - publish a single tree. - - This model has two notable features. The first is that - it's pull only. You have to ask, convince, or - beg another developer to take a change from you, because there - are almost no trees to which more than one person can push, - and there's no way to push changes into a tree that someone - else controls. - - The second is that it's based on reputation and acclaim. - If you're an unknown, Linus will probably ignore changes from - you without even responding. But a subsystem maintainer will - probably review them, and will likely take them if they pass - their criteria for suitability. The more good - changes you contribute to a maintainer, the more likely they - are to trust your judgment and accept your changes. If you're - well-known and maintain a long-lived branch for something - Linus hasn't yet accepted, people with similar interests may - pull your changes regularly to keep up with your work. - - Reputation and acclaim don't necessarily cross subsystem - or people boundaries. If you're a respected - but specialised storage hacker, and you try to fix a - networking bug, that change will receive a level of scrutiny - from a network maintainer comparable to a change from a - complete stranger. - - To people who come from more orderly project backgrounds, - the comparatively chaotic Linux kernel development process - often seems completely insane. 
It's subject to the whims of - individuals; people make sweeping changes whenever they deem - it appropriate; and the pace of development is astounding. - And yet Linux is a highly successful, well-regarded piece of - software. - - - - Pull-only versus shared-push collaboration - - A perpetual source of heat in the open source community is - whether a development model in which people only ever pull - changes from others is better than one in which - multiple people can push changes to a shared - repository. - - Typically, the backers of the shared-push model use tools - that actively enforce this approach. If you're using a - centralised revision control tool such as Subversion, there's - no way to make a choice over which model you'll use: the tool - gives you shared-push, and if you want to do anything else, - you'll have to roll your own approach on top (such as applying - a patch by hand). - - A good distributed revision control tool will - support both models. You and your collaborators can then - structure how you work together based on your own needs and - preferences, not on what contortions your tools force you - into. - - - Where collaboration meets branch management - - Once you and your team set up some shared - repositories and start propagating changes back and forth - between local and shared repos, you begin to face a related, - but slightly different challenge: that of managing the - multiple directions in which your team may be moving at once. - Even though this subject is intimately related to how your - team collaborates, it's dense enough to merit treatment of its - own, in . - -
- - - The technical side of sharing - - The remainder of this chapter is devoted to the question of - sharing changes with your collaborators. - - - - Informal sharing with <command role="hg-cmd">hg - serve</command> - - Mercurial's hg serve - command is wonderfully suited to small, tight-knit, and - fast-paced group environments. It also provides a great way to - get a feel for using Mercurial commands over a network. - - Run hg serve inside a - repository, and in under a second it will bring up a specialised - HTTP server; this will accept connections from any client, and - serve up data for that repository until you terminate it. - Anyone who knows the URL of the server you just started, and can - talk to your computer over the network, can then use a web - browser or Mercurial to read data from that repository. A URL - for a hg serve instance running - on a laptop is likely to look something like - http://my-laptop.local:8000/. - - The hg serve command is - not a general-purpose web server. It can do - only two things: - - Allow people to browse the history of the - repository it's serving, from their normal web - browsers. - - Speak Mercurial's wire protocol, so that people - can hg clone or hg pull changes from that - repository. - - In particular, hg serve - won't allow remote users to modify your - repository. It's intended for read-only use. - - If you're getting started with Mercurial, there's nothing to - prevent you from using hg serve - to serve up a repository on your own computer, then use commands - like hg clone, hg incoming, and so on to talk to that - server as if the repository was hosted remotely. This can help - you to quickly get acquainted with using commands on - network-hosted repositories. 
- - - A few things to keep in mind - - Because it provides unauthenticated read access to all - clients, you should only use hg - serve in an environment where you either don't - care, or have complete control over, who can access your - network and pull data from your repository. - - The hg serve command - knows nothing about any firewall software you might have - installed on your system or network. It cannot detect or - control your firewall software. If other people are unable to - talk to a running hg serve - instance, the second thing you should do - (after you make sure that they're using - the correct URL) is check your firewall configuration. - - By default, hg serve - listens for incoming connections on port 8000. If another - process is already listening on the port you want to use, you - can specify a different port to listen on using the option. - - Normally, when hg serve - starts, it prints no output, which can be a bit unnerving. If - you'd like to confirm that it is indeed running correctly, and - find out what URL you should send to your collaborators, start - it with the - option. - - - - - Using the Secure Shell (ssh) protocol - - You can pull and push changes securely over a network - connection using the Secure Shell (ssh) - protocol. To use this successfully, you may have to do a little - bit of configuration on the client or server sides. - - If you're not familiar with ssh, it's the name of - both a command and a network protocol that let you securely - communicate with another computer. To use it with Mercurial, - you'll be setting up one or more user accounts on a server so - that remote users can log in and execute commands. - - (If you are familiar with ssh, you'll - probably find some of the material that follows to be elementary - in nature.) - - - How to read and write ssh URLs - - An ssh URL tends to look like this: - ssh://bos@hg.serpentine.com:22/hg/hgbook - - The ssh:// - part tells Mercurial to use the ssh protocol. 
- - The bos@ - component indicates what username to log into the server - as. You can leave this out if the remote username is the - same as your local username. - - The - hg.serpentine.com gives - the hostname of the server to log into. - - The :22 identifies the port - number to connect to the server on. The default port is - 22, so you only need to specify a colon and port number if - you're not using port 22. - - The remainder of the URL is the local path to - the repository on the server. - - - There's plenty of scope for confusion with the path - component of ssh URLs, as there is no standard way for tools - to interpret it. Some programs behave differently than others - when dealing with these paths. This isn't an ideal situation, - but it's unlikely to change. Please read the following - paragraphs carefully. - - Mercurial treats the path to a repository on the server as - relative to the remote user's home directory. For example, if - user foo on the server has a home directory - of /home/foo, then an - ssh URL that contains a path component of bar really - refers to the directory /home/foo/bar. - - If you want to specify a path relative to another user's - home directory, you can use a path that starts with a tilde - character followed by the user's name (let's call them - otheruser), like this. - ssh://server/~otheruser/hg/repo - - And if you really want to specify an - absolute path on the server, begin the - path component with two slashes, as in this example. - ssh://server//absolute/path - - - - Finding an ssh client for your system - - Almost every Unix-like system comes with OpenSSH - preinstalled. If you're using such a system, run - which ssh to find out if the - ssh command is installed (it's usually in - /usr/bin). In the - unlikely event that it isn't present, take a look at your - system documentation to figure out how to install it. 
- - On Windows, the TortoiseHg package is bundled - with a version of Simon Tatham's excellent - plink command, and you should not need to - do any further configuration. - - - - Generating a key pair - - To avoid the need to repeatedly type a - password every time you need to use your ssh client, I - recommend generating a key pair. - - - Key pairs are not mandatory - - Mercurial knows nothing about ssh authentication or key - pairs. You can, if you like, safely ignore this section and - the one that follows until you grow tired of repeatedly - typing ssh passwords. - - - - - On a Unix-like system, the - ssh-keygen command will do the - trick. - On Windows, if you're using TortoiseHg, you may need - to download a command named puttygen - from the - PuTTY web site to generate a key pair. See - the - puttygen documentation for - details of how to use the command. - - - - When you generate a key pair, it's usually - highly advisable to protect it with a - passphrase. (The only time that you might not want to do this - is when you're using the ssh protocol for automated tasks on a - secure network.) - - Simply generating a key pair isn't enough, however. - You'll need to add the public key to the set of authorised - keys for whatever user you're logging in remotely as. For - servers using OpenSSH (the vast majority), this will mean - adding the public key to a list in a file called authorized_keys in their .ssh - directory. - - On a Unix-like system, your public key will have a - .pub extension. If you're using - puttygen on Windows, you can save the - public key to a file of your choosing, or paste it from the - window it's displayed in straight into the authorized_keys file. - - - Using an authentication agent - - An authentication agent is a daemon that stores - passphrases in memory (so it will forget passphrases if you - log out and log back in again). An ssh client will notice if - it's running, and query it for a passphrase.
If there's no - authentication agent running, or the agent doesn't store the - necessary passphrase, you'll have to type your passphrase - every time Mercurial tries to communicate with a server on - your behalf (e.g. whenever you pull or push changes). - - The downside of storing passphrases in an agent is that - it's possible for a well-prepared attacker to recover the - plain text of your passphrases, in some cases even if your - system has been power-cycled. You should make your own - judgment as to whether this is an acceptable risk. It - certainly saves a lot of repeated typing. - - - - On Unix-like systems, the agent is called - ssh-agent, and it's often run - automatically for you when you log in. You'll need to use - the ssh-add command to add passphrases - to the agent's store. - - - On Windows, if you're using TortoiseHg, the - pageant command acts as the agent. As - with puttygen, you'll need to download - pageant from the PuTTY web - site and read its - documentation. The pageant - command adds an icon to your system tray that will let you - manage stored passphrases. - - - - - - Configuring the server side properly - - Because ssh can be fiddly to set up if you're new to it, - a variety of things can go wrong. Add Mercurial - on top, and there's plenty more scope for head-scratching. - Most of these potential problems occur on the server side, not - the client side. The good news is that once you've gotten a - configuration working, it will usually continue to work - indefinitely. - - Before you try using Mercurial to talk to an ssh server, - it's best to make sure that you can use the normal - ssh or putty command to - talk to the server first. If you run into problems with using - these commands directly, Mercurial surely won't work. Worse, - it will obscure the underlying problem. 
Any time you want to - debug ssh-related Mercurial problems, you should drop back to - making sure that plain ssh client commands work first, - before you worry about whether there's a - problem with Mercurial. - - The first thing to be sure of on the server side is that - you can actually log in from another machine at all. If you - can't use ssh or putty - to log in, the error message you get may give you a few hints - as to what's wrong. The most common problems are as - follows. - - If you get a connection refused - error, either there isn't an SSH daemon running on the - server at all, or it's inaccessible due to firewall - configuration. - - If you get a no route to host - error, you either have an incorrect address for the server - or a seriously locked down firewall that won't admit its - existence at all. - - If you get a permission denied - error, you may have mistyped the username on the server, - or you could have mistyped your key's passphrase or the - remote user's password. - - In summary, if you're having trouble talking to the - server's ssh daemon, first make sure that one is running at - all. On many systems it will be installed, but disabled, by - default. Once you're done with this step, you should then - check that the server's firewall is configured to allow - incoming connections on the port the ssh daemon is listening - on (usually 22). Don't worry about more exotic possibilities - for misconfiguration until you've checked these two - first. - - If you're using an authentication agent on the client side - to store passphrases for your keys, you ought to be able to - log into the server without being prompted for a passphrase or - a password. If you're prompted for a passphrase, there are a - few possible culprits. - - You might have forgotten to use - ssh-add or pageant - to store the passphrase. - - You might have stored the passphrase for the - wrong key. 
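If your client uses OpenSSH, one way to tell these cases apart is to ask the agent what it is holding. The sketch below is only an illustration, not something Mercurial provides: describe_agent_status is a hypothetical helper name, and the exit codes it interprets are the ones documented for OpenSSH's ssh-add -l. It does not apply to pageant, where you would inspect the stored keys through its system tray icon instead.

```shell
# Hypothetical helper: turn the exit status of "ssh-add -l" into a hint.
# Assumes OpenSSH, whose ssh-add exits with 0 when the agent lists keys,
# 1 when the agent holds no keys, and 2 when no agent can be contacted.
describe_agent_status() {
    case "$1" in
        0) echo "agent is running and holds at least one key" ;;
        1) echo "agent is running but holds no keys; run ssh-add" ;;
        2) echo "cannot contact an agent; is ssh-agent running?" ;;
        *) echo "unexpected status $1; is ssh-add installed?" ;;
    esac
}

# Ask the agent which keys it holds, keeping only the exit status.
rc=0
ssh-add -l >/dev/null 2>&1 || rc=$?
describe_agent_status "$rc"
```

If the agent does hold keys, running ssh-add -l without the redirection prints their fingerprints, which lets you check whether the key you stored is the one the server expects.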
- - If you're being prompted for the remote user's password, - there are a few other possible problems to check. - - Either the user's home directory or their - .ssh - directory might have excessively liberal permissions. As - a result, the ssh daemon will not trust or read their - authorized_keys file. - For example, a group-writable home or .ssh - directory will often cause this symptom. - - The user's authorized_keys file may have - a problem. If anyone other than the user owns or can write - to that file, the ssh daemon will not trust or read - it. - - - In an ideal world, you should be able to run the - following command successfully, and it should print exactly - one line of output, the current date and time. - ssh myserver date - - If, on your server, you have login scripts that print - banners or other junk even when running non-interactive - commands like this, you should fix them before you continue, - so that they only print output if they're run interactively. - Otherwise these banners will at least clutter up Mercurial's - output. Worse, they could potentially cause problems with - running Mercurial commands remotely. Mercurial tries to - detect and ignore banners in non-interactive - ssh sessions, but it is not foolproof. (If - you're editing your login scripts on your server, the usual - way to see if a login script is running in an interactive - shell is to check the return code from the command - tty -s.) - - Once you've verified that plain old ssh is working with - your server, the next step is to ensure that Mercurial runs on - the server. The following command should run - successfully: - - ssh myserver hg version - - If you see an error message instead of normal hg version output, this is usually - because you haven't installed Mercurial to /usr/bin. Don't worry if this - is the case; you don't need to do that. But you should check - for a few possible problems. - - Is Mercurial really installed on the server at - all?
I know this sounds trivial, but it's worth - checking! - - Maybe your shell's search path (usually set - via the PATH environment variable) is - simply misconfigured. - - Perhaps your PATH environment - variable is only being set to point to the location of the - hg executable if the login session is - interactive. This can happen if you're setting the path - in the wrong shell login script. See your shell's - documentation for details. - - The PYTHONPATH environment - variable may need to contain the path to the Mercurial - Python modules. It might not be set at all; it could be - incorrect; or it may be set only if the login is - interactive. - - - If you can run hg version - over an ssh connection, well done! You've got the server and - client sorted out. You should now be able to use Mercurial to - access repositories hosted by that username on that server. - If you run into problems with Mercurial and ssh at this point, - try using the - option to get a clearer picture of what's going on. - - - Using compression with ssh - - Mercurial does not compress data when it uses the ssh - protocol, because the ssh protocol can transparently compress - data. However, the default behavior of ssh clients is - not to request compression. - - Over any network other than a fast LAN (even a wireless - network), using compression is likely to significantly speed - up Mercurial's network operations. For example, over a WAN, - someone measured compression as reducing the amount of time - required to clone a particularly large repository from 51 - minutes to 17 minutes. - - Both ssh and plink - accept a -C option which - turns on compression. You can easily edit your ~/.hgrc to enable compression for - all of Mercurial's uses of the ssh protocol. Here is how to - do so for regular ssh on Unix-like systems, - for example. - [ui] -ssh = ssh -C - - If you use ssh on a - Unix-like system, you can configure it to always use - compression when talking to your server.
To do this, edit - your .ssh/config file - (which may not yet exist), as follows. - - Host hg - Compression yes - HostName hg.example.com - - This defines a hostname alias, - hg. When you use that hostname on the - ssh command line or in a Mercurial - ssh-protocol URL, it will cause - ssh to connect to - hg.example.com and use compression. This - gives you both a shorter name to type and compression, each of - which is a good thing in its own right. - - - - - Serving over HTTP using CGI - - The simplest way to host one or more repositories in a - permanent way is to use a web server and Mercurial's CGI - support. - - Depending on how ambitious you are, configuring Mercurial's - CGI interface can take anything from a few moments to several - hours. - - We'll begin with the simplest of examples, and work our way - towards a more complex configuration. Even for the most basic - case, you're almost certainly going to need to read and modify - your web server's configuration. - - - High pain tolerance required - - Configuring a web server is a complex, fiddly, - and highly system-dependent activity. I can't possibly give - you instructions that will cover anything like all of the - cases you will encounter. Please use your discretion and - judgment in following the sections below. Be prepared to make - plenty of mistakes, and to spend a lot of time reading your - server's error logs. - - If you don't have a strong stomach for tweaking - configurations over and over, or a compelling need to host - your own services, you might want to try one of the public - hosting services that I mentioned earlier. - - - - Web server configuration checklist - - Before you continue, do take a few moments to check a few - aspects of your system's setup. - - - Do you have a web server installed - at all? Mac OS X and some Linux distributions ship with - Apache, but many other systems may not have a web server - installed. - - If you have a web server installed, is it - actually running? 
On most systems, even if one is - present, it will be disabled by default. - - Is your server configured to allow you to run - CGI programs in the directory where you plan to do so? - Most servers default to explicitly disabling the ability - to run CGI programs. - - - If you don't have a web server installed, and don't have - substantial experience configuring Apache, you should consider - using the lighttpd web server instead of - Apache. Apache has a well-deserved reputation for baroque and - confusing configuration. While lighttpd is - less capable in some ways than Apache, most of these - capabilities are not relevant to serving Mercurial - repositories. And lighttpd is undeniably - much easier to get started with than - Apache. - - - - Basic CGI configuration - - On Unix-like systems, it's common for users to have a - subdirectory named something like public_html in their home - directory, from which they can serve up web pages. A file - named foo in this directory will be - accessible at a URL of the form - http://www.example.com/username/foo. - - To get started, find the hgweb.cgi script that should be - present in your Mercurial installation. If you can't quickly - find a local copy on your system, simply download one from the - master Mercurial repository at http://www.selenic.com/repo/hg/raw-file/tip/hgweb.cgi. - - You'll need to copy this script into your public_html directory, and - ensure that it's executable. - cp .../hgweb.cgi ~/public_html -chmod 755 ~/public_html/hgweb.cgi - The 755 argument to - chmod is a little more general than just - making the script executable: it ensures that the script is - executable by anyone, and that group and - other write permissions are - not set. If you were to leave those - write permissions enabled, Apache's suexec - subsystem would likely refuse to execute the script. In fact, - suexec also insists that the - directory in which the script resides - must not be writable by others. 
- chmod 755 ~/public_html - - - What could <emphasis>possibly</emphasis> go - wrong? - - Once you've copied the CGI script into place, go into a - web browser, and try to open the URL http://myhostname/ - myuser/hgweb.cgi, but brace - yourself for instant failure. There's a high probability - that trying to visit this URL will fail, and there are many - possible reasons for this. In fact, you're likely to - stumble over almost every one of the possible errors below, - so please read carefully. The following are all of the - problems I ran into on a system running Fedora 7, with a - fresh installation of Apache, and a user account that I - created specially to perform this exercise. - - Your web server may have per-user directories disabled. - If you're using Apache, search your config file for a - UserDir directive. If there's none - present, per-user directories will be disabled. If one - exists, but its value is disabled, then - per-user directories will be disabled. Otherwise, the - string after UserDir gives the name of - the subdirectory that Apache will look in under your home - directory, for example public_html. - - Your file access permissions may be too restrictive. - The web server must be able to traverse your home directory - and directories under your public_html directory, and - read files under the latter too. Here's a quick recipe to - help you to make your permissions more appropriate. - chmod 755 ~ -find ~/public_html -type d -print0 | xargs -0r chmod 755 -find ~/public_html -type f -print0 | xargs -0r chmod 644 - - The other possibility with permissions is that you might - get a completely empty window when you try to load the - script. In this case, it's likely that your access - permissions are too permissive. Apache's - suexec subsystem won't execute a script - that's group- or world-writable, for example. - - Your web server may be configured to disallow execution - of CGI programs in your per-user web directory. 
Here's - Apache's default per-user configuration from my Fedora - system. - - &ch06-apache-config.lst; - - If you find a similar-looking - Directory group in your Apache - configuration, the directive to look at inside it is - Options. Add ExecCGI - to the end of this list if it's missing, and restart the web - server. - - If you find that Apache serves you the text of the CGI - script instead of executing it, you may need to either - uncomment (if already present) or add a directive like - this. - AddHandler cgi-script .cgi - - The next possibility is that you might be served with a - colourful Python backtrace claiming that it can't import a - mercurial-related module. This is - actually progress! The server is now capable of executing - your CGI script. This error is only likely to occur if - you're running a private installation of Mercurial, instead - of a system-wide version. Remember that the web server runs - the CGI program without any of the environment variables - that you take for granted in an interactive session. If - this error happens to you, edit your copy of hgweb.cgi and follow the - directions inside it to correctly set your - PYTHONPATH environment variable. - - Finally, you are certain to be - served with another colourful Python backtrace: this one - will complain that it can't find /path/to/repository. Edit - your hgweb.cgi script - and replace the /path/to/repository string - with the complete path to the repository you want to serve - up. - - At this point, when you try to reload the page, you - should be presented with a nice HTML view of your - repository's history. Whew! - - - - Configuring lighttpd - - To be exhaustive in my experiments, I tried configuring - the increasingly popular lighttpd web - server to serve the same repository as I described with - Apache above. I had already overcome all of the problems I - outlined with Apache, many of which are not server-specific.
- As a result, I was fairly sure that my file and directory - permissions were good, and that my hgweb.cgi script was properly - edited. - - Once I had Apache running, getting - lighttpd to serve the repository was a - snap (in other words, even if you're trying to use - lighttpd, you should read the Apache - section). I first had to edit the - mod_access section of its config file to - enable mod_cgi and - mod_userdir, both of which were disabled - by default on my system. I then added a few lines to the - end of the config file, to configure these modules. - userdir.path = "public_html" -cgi.assign = (".cgi" => "" ) - With this done, lighttpd ran - immediately for me. If I had configured - lighttpd before Apache, I'd almost - certainly have run into many of the same system-level - configuration problems as I did with Apache. However, I - found lighttpd to be noticeably easier to - configure than Apache, even though I've used Apache for over - a decade, and this was my first exposure to - lighttpd. - - - - - Sharing multiple repositories with one CGI script - - The hgweb.cgi script - only lets you publish a single repository, which is an - annoying restriction. If you want to publish more than one - without wracking yourself with multiple copies of the same - script, each with different names, a better choice is to use - the hgwebdir.cgi - script. - - The procedure to configure hgwebdir.cgi is only a little more - involved than for hgweb.cgi. First, you must obtain - a copy of the script. If you don't have one handy, you can - download a copy from the master Mercurial repository at http://www.selenic.com/repo/hg/raw-file/tip/hgwebdir.cgi. - - You'll need to copy this script into your public_html directory, and - ensure that it's executable. - - cp .../hgwebdir.cgi ~/public_html -chmod 755 ~/public_html ~/public_html/hgwebdir.cgi - - With basic configuration out of the way, try to - visit http://myhostname/ - myuser/hgwebdir.cgi in your browser. 
It should - display an empty list of repositories. If you get a blank - window or error message, try walking through the list of - potential problems in . - - The hgwebdir.cgi - script relies on an external configuration file. By default, - it searches for a file named hgweb.config in the same directory - as itself. You'll need to create this file, and make it - world-readable. The format of the file is similar to a - Windows ini file, as understood by Python's - ConfigParser - web:configparser module. - - The easiest way to configure hgwebdir.cgi is with a section - named collections. This will automatically - publish every repository under the - directories you name. The section should look like - this: - [collections] -/my/root = /my/root - Mercurial interprets this by looking at the directory name - on the right hand side of the - = sign; finding repositories - in that directory hierarchy; and using the text on the - left to strip off matching text from the - names it will actually list in the web interface. The - remaining component of a path after this stripping has - occurred is called a virtual path. - - Given the example above, if we have a repository whose - local path is /my/root/this/repo, the CGI - script will strip the leading /my/root from the name, and - publish the repository with a virtual path of this/repo. If the base URL for - our CGI script is http://myhostname/ - myuser/hgwebdir.cgi, the complete URL for that - repository will be http://myhostname/ - myuser/hgwebdir.cgi/this/repo. - - If we replace /my/root on the left hand side - of this example with /my, then hgwebdir.cgi will only strip off - /my from the repository - name, and will give us a virtual path of root/this/repo instead of - this/repo. - - The hgwebdir.cgi - script will recursively search each directory listed in the - collections section of its configuration - file, but it will not recurse into the - repositories it finds. 
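The prefix stripping described above can be sketched in a few lines of shell. This is only an illustration of the rule, not the code that hgwebdir.cgi actually runs; virtual_path is a hypothetical helper name, and its two arguments stand for the left hand side of a collections entry and the local path of a repository found under the right hand side.

```shell
# Hypothetical helper: virtual_path PREFIX REPO_PATH
# Prints what remains of REPO_PATH once the leading PREFIX and the slash
# after it are stripped; fails if REPO_PATH does not lie under PREFIX.
virtual_path() {
    prefix=${1%/}    # tolerate a trailing slash on the prefix
    repo=$2
    case "$repo" in
        "$prefix"/*) printf '%s\n' "${repo#"$prefix"/}" ;;
        *) return 1 ;;
    esac
}

virtual_path /my/root /my/root/this/repo    # prints this/repo
virtual_path /my /my/root/this/repo         # prints root/this/repo
```

Appending the printed value to the base URL of the CGI script gives the URL at which the repository is published.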
- - The collections mechanism makes it easy - to publish many repositories in a fire and - forget manner. You only need to set up the CGI - script and configuration file one time. Afterwards, you can - publish or unpublish a repository at any time by simply moving - it into, or out of, the directory hierarchy in which you've - configured hgwebdir.cgi to - look. - - - Explicitly specifying which repositories to - publish - - In addition to the collections - mechanism, the hgwebdir.cgi script allows you - to publish a specific list of repositories. To do so, - create a paths section, with contents of - the following form. - [paths] -repo1 = /my/path/to/some/repo -repo2 = /some/path/to/another - In this case, the virtual path (the component that will - appear in a URL) is on the left hand side of each - definition, while the path to the repository is on the - right. Notice that there does not need to be any - relationship between the virtual path you choose and the - location of a repository in your filesystem. - - If you wish, you can use both the - collections and paths - mechanisms simultaneously in a single configuration - file. - - - Beware duplicate virtual paths - - If several repositories have the same - virtual path, hgwebdir.cgi will not report - an error. Instead, it will behave unpredictably. - - - - - - Downloading source archives - - Mercurial's web interface lets users download an archive - of any revision. This archive will contain a snapshot of the - working directory as of that revision, but it will not contain - a copy of the repository data. - - By default, this feature is not enabled. To enable it, - you'll need to add an allow_archive item to the - web section of your ~/.hgrc; see below for details. - - - Web configuration options - - Mercurial's web interfaces (the hg - serve command, and the hgweb.cgi and hgwebdir.cgi scripts) have a - number of configuration options that you can set. These - belong in a section named web. 
- - allow_archive: Determines - which (if any) archive download mechanisms Mercurial - supports. If you enable this feature, users of the web - interface will be able to download an archive of whatever - revision of a repository they are viewing. To enable the - archive feature, this item must take the form of a - sequence of words drawn from the list below. - - bz2: A - tar archive, compressed using - bzip2 compression. This has the - best compression ratio, but uses the most CPU time on - the server. - - gz: A - tar archive, compressed using - gzip compression. - - zip: A - zip archive, typically compressed using DEFLATE - compression. This format has the worst compression - ratio, but is widely used in the Windows world. - - - If you provide an empty list, or don't have an - allow_archive entry at - all, this feature will be disabled. Here is an example of - how to enable all three supported formats. - [web] -allow_archive = bz2 gz zip - - allowpull: - Boolean. Determines whether the web interface allows - remote users to hg pull - and hg clone this - repository over HTTP. If set to no or - false, only the - human-oriented portion of the web interface - is available. - - contact: - String. A free-form (but preferably brief) string - identifying the person or group in charge of the - repository. This often contains the name and email - address of a person or mailing list. It often makes sense - to place this entry in a repository's own .hg/hgrc file, but it can make - sense to use in a global ~/.hgrc if every repository - has a single maintainer. - - maxchanges: - Integer. The default maximum number of changesets to - display in a single page of output. - - maxfiles: - Integer. The default maximum number of modified files to - display in a single page of output. - - stripes: - Integer. If the web interface displays alternating - stripes to make it easier to visually align - rows when you are looking at a table, this number controls - the number of rows in each stripe.
- - style: Controls the template - Mercurial uses to display the web interface. Mercurial - ships with several web templates. - - - coal is monochromatic. - - - gitweb emulates the visual - style of git's web interface. - - - monoblue uses solid blues and - greys. - - - paper is the default. - - - spartan was the default for a - long time. - - - You can - also specify a custom template of your own; see - for details. Here, you can - see how to enable the gitweb - style. - [web] -style = gitweb - - templates: - Path. The directory in which to search for template - files. By default, Mercurial searches in the directory in - which it was installed. - - If you are using hgwebdir.cgi, you can place a few - configuration items in a web - section of the hgweb.config file instead of a - ~/.hgrc file, for - convenience. These items are motd and style. - - - Options specific to an individual repository - - A few web configuration - items ought to be placed in a repository's local .hg/hgrc, rather than a user's - or global ~/.hgrc. - - description: String. A - free-form (but preferably brief) string that describes - the contents or purpose of the repository. - - name: - String. The name to use for the repository in the web - interface. This overrides the default name, which is - the last component of the repository's path. - - - - - Options specific to the <command role="hg-cmd">hg - serve</command> command - - Some of the items in the web section of a ~/.hgrc file are only for use - with the hg serve - command. - - accesslog: - Path. The name of a file into which to write an access - log. By default, the hg - serve command writes this information to - standard output, not to a file. Log entries are written - in the standard combined file format used - by almost all web servers. - - address: - String. The local address on which the server should - listen for incoming connections. By default, the server - listens on all addresses. - - errorlog: - Path. 
The name of a file into which to write an error - log. By default, the hg - serve command writes this information to - standard error, not to a file. - - ipv6: - Boolean. Whether to use the IPv6 protocol. By default, - IPv6 is not used. - - port: - Integer. The TCP port number on which the server should - listen. The default port number used is 8000. - - - - - Choosing the right <filename - role="special">~/.hgrc</filename> file to add <literal - role="rc-web">web</literal> items to - - It is important to remember that a web server like - Apache or lighttpd will run under a user - ID that is different to yours. CGI scripts run by your - server, such as hgweb.cgi, will usually also run - under that user ID. - - If you add web items to - your own personal ~/.hgrc file, CGI scripts won't read that - ~/.hgrc file. Those - settings will thus only affect the behavior of the hg serve command when you run it. - To cause CGI scripts to see your settings, either create a - ~/.hgrc file in the - home directory of the user ID that runs your web server, or - add those settings to a system-wide hgrc file. - - - - - - System-wide configuration - - On Unix-like systems shared by multiple users (such as a - server to which people publish changes), it often makes sense to - set up some global default behaviors, such as what theme to use - in web interfaces. - - If a file named /etc/mercurial/hgrc - exists, Mercurial will read it at startup time and apply any - configuration settings it finds in that file. It will also look - for files ending in a .rc extension in a - directory named /etc/mercurial/hgrc.d, and - apply any configuration settings it finds in each of those - files. - - - Making Mercurial more trusting - - One situation in which a global hgrc - can be useful is if users are pulling changes owned by other - users. By default, Mercurial will not trust most of the - configuration items in a .hg/hgrc file - inside a repository that is owned by a different user. 
If we - clone or pull changes from such a repository, Mercurial will - print a warning stating that it does not trust their - .hg/hgrc. - - If everyone in a particular Unix group is on the same team - and should trust each other's - configuration settings, or we want to trust particular users, - we can override Mercurial's skeptical defaults by creating a - system-wide hgrc file such as the - following: - - # Save this as e.g. /etc/mercurial/hgrc.d/trust.rc -[trusted] -# Trust all entries in any hgrc file owned by the "editors" or -# "www-data" groups. -groups = editors, www-data - -# Trust entries in hgrc files owned by the following users. -users = apache, bobo - - - -
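The same directory can carry site-wide defaults for the web interface, such as the theme mentioned at the start of this section. The fragment below is a minimal sketch rather than a recommended setup: the file name web.rc is hypothetical (any file ending in .rc in /etc/mercurial/hgrc.d is read), and the values are only illustrative choices from the web options described earlier.

```ini
# Hypothetical file: /etc/mercurial/hgrc.d/web.rc
# Site-wide defaults for the web interface; the values are illustrative.
[web]
# Use the gitweb template for every repository served from this machine.
style = gitweb
# Offer gzip-compressed tar archives and zip archives for download.
allow_archive = gz zip
```

Because CGI scripts usually run under the web server's user ID, a system-wide file of this kind is one way to make sure they see the same settings as everyone else on the machine.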
- - diff -r 5276f40fca1c -r 896ab6eaf1c6 en/ch05-daily.xml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/en/ch05-daily.xml Fri Jul 10 02:32:17 2009 +0900 @@ -0,0 +1,840 @@ + + + + + Mercurial in daily use + + + Telling Mercurial which files to track + + Mercurial does not work with files in your repository unless + you tell it to manage them. The hg + status command will tell you which files Mercurial + doesn't know about; it uses a + ? to display such + files. + + To tell Mercurial to track a file, use the hg add command. Once you have added a + file, the entry in the output of hg + status for that file changes from + ? to + A. + + &interaction.daily.files.add; + + After you run a hg commit, + the files that you added before the commit will no longer be + listed in the output of hg + status. The reason for this is that by default, hg status only tells you about + interesting files&emdash;those that you have (for + example) modified, removed, or renamed. If you have a repository + that contains thousands of files, you will rarely want to know + about files that Mercurial is tracking, but that have not + changed. (You can still get this information; we'll return to + this later.) + + Once you add a file, Mercurial doesn't do anything with it + immediately. Instead, it will take a snapshot of the file's + state the next time you perform a commit. It will then continue + to track the changes you make to the file every time you commit, + until you remove the file. + + + Explicit versus implicit file naming + + A useful behavior that Mercurial has is that if you pass + the name of a directory to a command, every Mercurial command + will treat this as I want to operate on every file in + this directory and its subdirectories. + + &interaction.daily.files.add-dir; + + Notice in this example that Mercurial printed + the names of the files it added, whereas it didn't do so when + we added the file named myfile.txt in the + earlier example. 
+ + What's going on is that in the former case, we explicitly + named the file to add on the command line. The assumption + that Mercurial makes in such cases is that we know what we + are doing, and it doesn't print any output. + + However, when we imply the names of + files by giving the name of a directory, Mercurial takes the + extra step of printing the name of each file that it does + something with. This makes it more clear what is happening, + and reduces the likelihood of a silent and nasty surprise. + This behavior is common to most Mercurial commands. + + + + Mercurial tracks files, not directories + + Mercurial does not track directory information. Instead, + it tracks the path to a file. Before creating a file, it + first creates any missing directory components of the path. + After it deletes a file, it then deletes any empty directories + that were in the deleted file's path. This sounds like a + trivial distinction, but it has one minor practical + consequence: it is not possible to represent a completely + empty directory in Mercurial. + + Empty directories are rarely useful, and there are + unintrusive workarounds that you can use to achieve an + appropriate effect. The developers of Mercurial thus felt + that the complexity that would be required to manage empty + directories was not worth the limited benefit this feature + would bring. + + If you need an empty directory in your repository, there + are a few ways to achieve this. One is to create a directory, + then hg add a + hidden file to that directory. On Unix-like + systems, any file name that begins with a period + (.) is treated as hidden by + most commands and GUI tools. This approach is illustrated + below. + +&interaction.daily.files.hidden; + + Another way to tackle a need for an empty directory is to + simply create one in your automated build scripts before they + will need it. 
+ + + + + How to stop tracking a file + + Once you decide that a file no longer belongs in + your repository, use the hg + remove command. This deletes the file, and tells + Mercurial to stop tracking it (which will occur at the next + commit). A removed file is represented in the output of + hg status with an + R. + + &interaction.daily.files.remove; + + After you hg remove a file, + Mercurial will no longer track changes to that file, even if you + recreate a file with the same name in your working directory. + If you do recreate a file with the same name and want Mercurial + to track the new file, simply hg + add it. Mercurial will know that the newly added + file is not related to the old file of the same name. + + + Removing a file does not affect its history + + It is important to understand that removing a file has + only two effects. + + It removes the current version of the file + from the working directory. + + It stops Mercurial from tracking changes to + the file, from the time of the next commit. + + Removing a file does not in any way + alter the history of the file. + + If you update the working directory to a + changeset that was committed when it was still tracking a file + that you later removed, the file will reappear in the working + directory, with the contents it had when you committed that + changeset. If you then update the working directory to a + later changeset, in which the file had been removed, Mercurial + will once again remove the file from the working + directory. + + + + Missing files + + Mercurial considers a file that you have deleted, but not + used hg remove to delete, to + be missing. A missing file is + represented with ! in the + output of hg status. + Mercurial commands will not generally do anything with missing + files.
+ + &interaction.daily.files.missing; + + If your repository contains a file that hg status reports as missing, and + you want the file to stay gone, you can run hg remove at any + time later on, to tell Mercurial that you really did mean to + remove the file. + + &interaction.daily.files.remove-after; + + On the other hand, if you deleted the missing file by + accident, give hg revert the + name of the file to recover. It will reappear, in unmodified + form. + + &interaction.daily.files.recover-missing; + + + + Aside: why tell Mercurial explicitly to remove a + file? + + You might wonder why Mercurial requires you to explicitly + tell it that you are deleting a file. Early during the + development of Mercurial, it let you delete a file however you + pleased; Mercurial would notice the absence of the file + automatically when you next ran a hg + commit, and stop tracking the file. In practice, + this made it too easy to accidentally remove a file without + noticing. + + + + Useful shorthand&emdash;adding and removing files in one + step + + Mercurial offers a combination command, hg addremove, that adds untracked + files and marks missing files as removed. + + &interaction.daily.files.addremove; + + The hg commit command + also provides a + option that performs this same add-and-remove, immediately + followed by a commit. + + &interaction.daily.files.commit-addremove; + + + + + Copying files + + Mercurial provides a hg + copy command that lets you make a new copy of a + file. When you copy a file using this command, Mercurial makes + a record of the fact that the new file is a copy of the original + file. It treats these copied files specially when you merge + your work with someone else's. + + + The results of copying during a merge + + What happens during a merge is that changes + follow a copy. To best illustrate what this + means, let's create an example. We'll start with the usual + tiny repository that contains a single file. 
+ + &interaction.daily.copy.init; + + We need to do some work in + parallel, so that we'll have something to merge. So let's + clone our repository. + + &interaction.daily.copy.clone; + + Back in our initial repository, let's use the hg copy command to make a copy of + the first file we created. + + &interaction.daily.copy.copy; + + If we look at the output of the hg + status command afterwards, the copied file looks + just like a normal added file. + + &interaction.daily.copy.status; + + But if we pass the option to hg status, it prints another line of + output: this is the file that our newly-added file was copied + from. + + &interaction.daily.copy.status-copy; + + Now, back in the repository we cloned, let's make a change + in parallel. We'll add a line of content to the original file + that we created. + + &interaction.daily.copy.other; + + Now we have a modified file in this + repository. When we pull the changes from the first + repository, and merge the two heads, Mercurial will propagate + the changes that we made locally to file + into its copy, new-file. + + &interaction.daily.copy.merge; + + + + Why should changes follow copies? + + This behavior&emdash;of changes to a file + propagating out to copies of the file&emdash;might seem + esoteric, but in most cases it's highly desirable. + + First of all, remember that this propagation + only happens when you merge. So if you + hg copy a file, and + subsequently modify the original file during the normal course + of your work, nothing will happen. + + The second thing to know is that modifications will only + propagate across a copy as long as the changeset that you're + merging changes from hasn't yet seen + the copy. + + The reason that Mercurial does this is as follows. Let's + say I make an important bug fix in a source file, and commit + my changes. 
Meanwhile, you've decided to hg copy the file in your repository, + without knowing about the bug or having seen the fix, and you + have started hacking on your copy of the file. + + If you pulled and merged my changes, and Mercurial + didn't propagate changes across copies, + your new source file would now contain the bug, and unless you + knew to propagate the bug fix by hand, the bug would + remain in your copy of the file. + + By automatically propagating the change that fixed the bug + from the original file to the copy, Mercurial prevents this + class of problem. To my knowledge, Mercurial is the + only revision control system that + propagates changes across copies like this. + + Once your change history has a record that the copy and + subsequent merge occurred, there's usually no further need to + propagate changes from the original file to the copied file, + and that's why Mercurial only propagates changes across copies + at the first merge, and not afterwards. + + + + How to make changes <emphasis>not</emphasis> follow a + copy + + If, for some reason, you decide that this business of + automatically propagating changes across copies is not for + you, simply use your system's normal file copy command (on + Unix-like systems, that's cp) to make a + copy of a file, then hg add + the new copy by hand. Before you do so, though, please do + reread , and make + an informed + decision that this behavior is not appropriate to your + specific case. + + + + Behavior of the <command role="hg-cmd">hg copy</command> + command + + When you use the hg copy + command, Mercurial makes a copy of each source file as it + currently stands in the working directory. This means that if + you make some modifications to a file, then hg copy it without first having + committed those changes, the new copy will also contain the + modifications you have made up until that point. (I find this + behavior a little counterintuitive, which is why I mention it + here.) 
+ + The hg copy + command acts similarly to the Unix cp + command (you can use the hg + cp alias if you prefer). We must supply two or + more arguments, of which the last is treated as the + destination, and all others are + sources. + + If you pass hg copy a + single file as the source, and the destination does not exist, + it creates a new file with that name. + + &interaction.daily.copy.simple; + + If the destination is a directory, Mercurial copies its + sources into that directory. + + &interaction.daily.copy.dir-dest; + + Copying a directory is + recursive, and preserves the directory structure of the + source. + + &interaction.daily.copy.dir-src; + + If the source and destination are both directories, the + source tree is recreated in the destination directory. + + &interaction.daily.copy.dir-src-dest; + + As with the hg remove + command, if you copy a file manually and then want Mercurial + to know that you've copied the file, simply use the option to hg copy. + + &interaction.daily.copy.after; + + + + + Renaming files + + It's rather more common to need to rename a file than to + make a copy of it. The reason I discussed the hg copy command before talking about + renaming files is that Mercurial treats a rename in essentially + the same way as a copy. Therefore, knowing what Mercurial does + when you copy a file tells you what to expect when you rename a + file. + + When you use the hg rename + command, Mercurial makes a copy of each source file, then + deletes it and marks the file as removed. + + &interaction.daily.rename.rename; + + The hg status command shows + the newly copied file as added, and the copied-from file as + removed. + + &interaction.daily.rename.status; + + As with the results of a hg + copy, we must use the option to hg status to see that the added file + is really being tracked by Mercurial as a copy of the original, + now removed, file. 
+ + &interaction.daily.rename.status-copy; + + As with hg remove and + hg copy, you can tell Mercurial + about a rename after the fact using the option. In most other + respects, the behavior of the hg + rename command, and the options it accepts, are + similar to the hg copy + command. + + If you're familiar with the Unix command line, you'll be + glad to know that the hg rename + command can be invoked as hg + mv. + + + Renaming files and merging changes + + Since Mercurial's rename is implemented as + copy-and-remove, the same propagation of changes happens when + you merge after a rename as after a copy. + + If I modify a file, and you rename it to a new name, and + then we merge our respective changes, my modifications to the + file under its original name will be propagated into the file + under its new name. (This is something you might expect to + simply work, but not all revision control + systems actually do this.) + + Whereas having changes follow a copy is a feature where + you can perhaps nod and say yes, that might be + useful, it should be clear that having them follow a + rename is definitely important. Without this facility, it + would simply be too easy for changes to become orphaned when + files are renamed. + + + + Divergent renames and merging + + The case of diverging names occurs when two developers + start with a file&emdash;let's call it + foo&emdash;in their respective + repositories. + + &interaction.rename.divergent.clone; + + Anne renames the file to bar. + + &interaction.rename.divergent.rename.anne; + + Meanwhile, Bob renames it to + quux. (Remember that hg mv is an alias for hg rename.) + + &interaction.rename.divergent.rename.bob; + + I like to think of this as a conflict because each + developer has expressed different intentions about what the + file ought to be named. + + What do you think should happen when they merge their + work?
Mercurial's actual behavior is that it always preserves + both names when it merges changesets that + contain divergent renames. + + &interaction.rename.divergent.merge; + + Notice that while Mercurial warns about the divergent + renames, it leaves it up to you to do something about the + divergence after the merge. + + + + Convergent renames and merging + + Another kind of rename conflict occurs when two people + choose to rename different source files + to the same destination. In this case, + Mercurial runs its normal merge machinery, and lets you guide + it to a suitable resolution. + + + + Other name-related corner cases + + Mercurial has a longstanding bug in which it fails to + handle a merge where one side has a file with a given name, + while another has a directory with the same name. This is + documented as issue + 29. + + &interaction.issue29.go; + + + + + + Recovering from mistakes + + Mercurial has some useful commands that will help you to + recover from some common mistakes. + + The hg revert command lets + you undo changes that you have made to your working directory. + For example, if you hg add a + file by accident, just run hg + revert with the name of the file you added, and + while the file won't be touched in any way, it won't be tracked + for adding by Mercurial any longer, either. You can also use + hg revert to get rid of + erroneous changes to a file. + + It is helpful to remember that the hg revert command is useful for + changes that you have not yet committed. Once you've committed + a change, if you decide it was a mistake, you can still do + something about it, though your options may be more + limited. + + For more information about the hg revert command, and details about + how to deal with changes you have already committed, see . + + + + Dealing with tricky merges + + In a complicated or large project, it's not unusual for a + merge of two changesets to result in some headaches. 
Suppose + there's a big source file that's been extensively edited by each + side of a merge: this is almost inevitably going to result in + conflicts, some of which can take a few tries to sort + out. + + Let's develop a simple case of this and see how to deal with + it. We'll start off with a repository containing one file, and + clone it twice. + + &interaction.ch04-resolve.init; + + In one clone, we'll modify the file in one way. + + &interaction.ch04-resolve.left; + + In another, we'll modify the file differently. + + &interaction.ch04-resolve.right; + + Next, we'll pull each set of changes into our original + repo. + + &interaction.ch04-resolve.pull; + + We expect our repository to now contain two heads. + + &interaction.ch04-resolve.heads; + + Normally, if we run hg + merge at this point, it will drop us into a GUI that + will let us manually resolve the conflicting edits to + myfile.txt. However, to simplify things + for presentation here, we'd like the merge to fail immediately + instead. Here's one way we can do so. + + &interaction.ch04-resolve.export; + + We've told Mercurial's merge machinery to run the command + false (which, as we desire, fails + immediately) if it detects a merge that it can't sort out + automatically. + + If we now fire up hg + merge, it should grind to a halt and report a + failure. + + &interaction.ch04-resolve.merge; + + Even if we don't notice that the merge failed, Mercurial + will prevent us from accidentally committing the result of a + failed merge. + + &interaction.ch04-resolve.cifail; + + When hg commit fails in + this case, it suggests that we use the unfamiliar hg resolve command. As usual, + hg help resolve will print a + helpful synopsis. + + + File resolution states + + When a merge occurs, most files will usually remain + unmodified. For each file where Mercurial has to do + something, it tracks the state of the file. 
+ + + + A resolved file has been + successfully merged, either automatically by Mercurial or + manually with human intervention. + + + An unresolved file was not merged + successfully, and needs more attention. + + + + If Mercurial sees any file in the + unresolved state after a merge, it considers the merge to have + failed. Fortunately, we do not need to restart the entire + merge from scratch. + + The or + option to hg resolve prints out the state of + each merged file. + + &interaction.ch04-resolve.list; + + In the output from hg + resolve, a resolved file is marked with + R, while an unresolved file is marked with + U. If any files are listed with + U, we know that an attempt to commit the + results of the merge will fail. + + + + Resolving a file merge + + We have several options to move a file from the unresolved + into the resolved state. By far the most common is to rerun + hg resolve. If we pass the + names of individual files or directories, it will retry the + merges of any unresolved files present in those locations. We + can also pass the + or option, which + will retry the merges of all unresolved + files. + + Mercurial also lets us modify the resolution state of a + file directly. We can manually mark a file as resolved using + the option, or + as unresolved using the option. This allows + us to clean up a particularly messy merge by hand, and to keep + track of our progress with each file as we go. + + + + + More useful diffs + + The default output of the hg + diff command is backwards compatible with the + regular diff command, but this has some + drawbacks. + + Consider the case where we use hg + rename to rename a file. + + &interaction.ch04-diff.rename.basic; + + The output of hg diff above + obscures the fact that we simply renamed a file. The hg diff command accepts an option, + or , to use a newer + diff format that displays such information in a more readable + form. 
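If the newer format suits everyday use, it can be made the default instead of being requested on every invocation. The fragment below is a sketch under one assumption not stated in the text above: that the setting lives in the diff section of a per-user ~/.hgrc, where the git item turns on the same format that the --git (-g) option selects.

```ini
# ~/.hgrc fragment (a sketch): have "hg diff" use the git-style format by default.
[diff]
git = True
```

With this in place, a plain hg diff should show renames and permission changes in the more readable form without any extra option.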
+ + &interaction.ch04-diff.rename.git; + + This option also helps with a case that can otherwise be + confusing: a file that appears to be modified according to + hg status, but for which + hg diff prints nothing. This + situation can arise if we change the file's execute + permissions. + + &interaction.ch04-diff.chmod; + + The normal diff command pays no attention + to file permissions, which is why hg + diff prints nothing by default. If we supply it + with the option, it tells us what really + happened. + + &interaction.ch04-diff.chmod.git; + + + + Which files to manage, and which to avoid + + Revision control systems are generally best at managing text + files that are written by humans, such as source code, where the + files do not change much from one revision to the next. Some + centralized revision control systems can also deal tolerably + well with binary files, such as bitmap images. + + For instance, a game development team will typically manage + both its source code and all of its binary assets (e.g. geometry + data, textures, map layouts) in a revision control + system. + + Because it is usually impossible to merge two conflicting + modifications to a binary file, centralized systems often + provide a file locking mechanism that allows a user to say + I am the only person who can edit this + file. + + Compared to a centralized system, a distributed revision + control system changes some of the factors that guide decisions + over which files to manage and how. + + For instance, a distributed revision control system cannot, + by its nature, offer a file locking facility. There is thus no + built-in mechanism to prevent two people from making conflicting + changes to a binary file. If you have a team where several + people may be editing binary files frequently, it may not be a + good idea to use Mercurial&emdash;or any other distributed + revision control system&emdash;to manage those files.
+ + When storing modifications to a file, Mercurial usually + saves only the differences between the previous and current + versions of the file. For most text files, this is extremely + efficient. However, some files (particularly binary files) are + laid out in such a way that even a small change to a file's + logical content results in many or most of the bytes inside the + file changing. For instance, compressed files are particularly + susceptible to this. If the differences between each successive + version of a file are always large, Mercurial will not be able + to store the file's revision history very efficiently. This can + affect both local storage needs and the amount of time it takes + to clone a repository. + + To get an idea of how this could affect you in practice, + suppose you want to use Mercurial to manage an OpenOffice + document. OpenOffice stores documents on disk as compressed zip + files. Edit even a single letter of your document in OpenOffice, + and almost every byte in the entire file will change when you + save it. Now suppose that file is 2MB in size. Because most of + the file changes every time you save, Mercurial will have to + store all 2MB of the file every time you commit, even though + from your perspective, perhaps only a few words are changing + each time. A single frequently-edited file that is not friendly + to Mercurial's storage assumptions can easily have an outsized + effect on the size of the repository. + + Even worse, if both you and someone else edit the OpenOffice + document you're working on, there is no useful way to merge your + work. In fact, there isn't even a good way to tell what the + differences are between your respective changes. + + There are thus a few clear recommendations about specific + kinds of files to be very careful with. + + + + Files that are very large and incompressible, e.g. ISO + CD-ROM images, will by virtue of sheer size make clones over + a network very slow. 
+ + + Files that change a lot from one revision to the next + may be expensive to store if you edit them frequently, and + conflicts due to concurrent edits may be difficult to + resolve. + + + + + + Backups and mirroring + + Since Mercurial maintains a complete copy of history in each + clone, everyone who uses Mercurial to collaborate on a project + can potentially act as a source of backups in the event of a + catastrophe. If a central repository becomes unavailable, you + can construct a replacement simply by cloning a copy of the + repository from one contributor, and pulling any changes they + may not have seen from others. + + It is simple to use Mercurial to perform off-site backups + and remote mirrors. Set up a periodic job (e.g. via the + cron command) on a remote server to pull + changes from your master repositories every hour. This will + only be tricky in the unlikely case that the number of master + repositories you maintain changes frequently, in which case + you'll need to do a little scripting to refresh the list of + repositories to back up. + + If you perform traditional backups of your master + repositories to tape or disk, and you want to back up a + repository named myrepo, use hg + clone -U myrepo myrepo.bak to create a + clone of myrepo before you start your + backups. The option doesn't check out a + working directory after the clone completes, since that would be + superfluous and make the backup take longer. + + If you then back up myrepo.bak instead + of myrepo, you will be guaranteed to have a + consistent snapshot of your repository that won't be pushed to + by an insomniac developer in mid-backup. 
+ + + + diff -r 5276f40fca1c -r 896ab6eaf1c6 en/ch06-collab.xml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/en/ch06-collab.xml Fri Jul 10 02:32:17 2009 +0900 @@ -0,0 +1,1565 @@ + + + + + Collaborating with other people + + As a completely decentralised tool, Mercurial doesn't impose + any policy on how people ought to work with each other. However, + if you're new to distributed revision control, it helps to have + some tools and examples in mind when you're thinking about + possible workflow models. + + + Mercurial's web interface + + Mercurial has a powerful web interface that provides several + useful capabilities. + + For interactive use, the web interface lets you browse a + single repository or a collection of repositories. You can view + the history of a repository, examine each change (comments and + diffs), and view the contents of each directory and file. You + can even get a view of history that gives a graphical view of + the relationships between individual changes and merges. + + Also for human consumption, the web interface provides + Atom and RSS feeds of the changes in a repository. This lets you + subscribe to a repository using your favorite + feed reader, and be automatically notified of activity in that + repository as soon as it happens. I find this capability much + more convenient than the model of subscribing to a mailing list + to which notifications are sent, as it requires no additional + configuration on the part of whoever is serving the + repository. + + The web interface also lets remote users clone a repository, + pull changes from it, and (when the server is configured to + permit it) push changes back to it. Mercurial's HTTP tunneling + protocol aggressively compresses data, so that it works + efficiently even over low-bandwidth network connections. 
+ + The easiest way to get started with the web interface is to + use your web browser to visit an existing repository, such as + the master Mercurial repository at http://www.selenic.com/repo/hg. + + If you're interested in providing a web interface + to your own repositories, there are several good ways to do + this. + + The easiest and fastest way to get started in an informal + environment is to use the hg + serve command, which is best suited to short-term + lightweight serving. See below for details of how to use + this command. + + For longer-lived repositories that you'd like to + have permanently available, there are several public hosting + services available. Some are free to open source projects, + while others offer paid commercial hosting. An up-to-date list + is available at http://www.selenic.com/mercurial/wiki/index.cgi/MercurialHosting. + + If you would prefer to host your own repositories, Mercurial + has built-in support for several popular hosting technologies, + most notably CGI (Common Gateway Interface), and WSGI (Web + Server Gateway Interface). See for details of CGI and WSGI + configuration. + + + + Collaboration models + + With a suitably flexible tool, making decisions about + workflow is much more of a social engineering challenge than a + technical one. Mercurial imposes few limitations on how you can + structure the flow of work in a project, so it's up to you and + your group to set up and live with a model that matches your own + particular needs. + + + Factors to keep in mind + + The most important aspect of any model that you must keep + in mind is how well it matches the needs and capabilities of + the people who will be using it. This might seem + self-evident; even so, you still can't afford to forget it for + a moment. + + I once put together a workflow model that seemed to make + perfect sense to me, but that caused a considerable amount of + consternation and strife within my development team.
In spite + of my attempts to explain why we needed a complex set of + branches, and how changes ought to flow between them, a few + team members revolted. Even though they were smart people, + they didn't want to pay attention to the constraints we were + operating under, or face the consequences of those constraints + in the details of the model that I was advocating. + + Don't sweep foreseeable social or technical problems under + the rug. Whatever scheme you put into effect, you should plan + for mistakes and problem scenarios. Consider adding automated + machinery to prevent, or quickly recover from, trouble that + you can anticipate. As an example, if you intend to have a + branch with not-for-release changes in it, you'd do well to + think early about the possibility that someone might + accidentally merge those changes into a release branch. You + could avoid this particular problem by writing a hook that + prevents changes from being merged from an inappropriate + branch. + + + + Informal anarchy + + I wouldn't suggest an anything goes + approach as something sustainable, but it's a model that's + easy to grasp, and it works perfectly well in a few unusual + situations. + + As one example, many projects have a loose-knit group of + collaborators who rarely physically meet each other. Some + groups like to overcome the isolation of working at a distance + by organizing occasional sprints. In a sprint, + a number of people get together in a single location (a + company's conference room, a hotel meeting room, that kind of + place) and spend several days more or less locked in there, + hacking intensely on a handful of projects. + + A sprint or a hacking session in a coffee shop are the perfect places to use the + hg serve command, since + hg serve does not require any + fancy server infrastructure. You can get started with + hg serve in moments, by + reading below. 
Then simply + tell the person next to you that you're running a server, send + the URL to them in an instant message, and you immediately + have a quick-turnaround way to work together. They can type + your URL into their web browser and quickly review your + changes; or they can pull a bugfix from you and verify it; or + they can clone a branch containing a new feature and try it + out. + + The charm, and the problem, with doing things + in an ad hoc fashion like this is that only people who know + about your changes, and where they are, can see them. Such an + informal approach simply doesn't scale beyond a handful of + people, because each individual needs to know about + n different repositories to pull + from. + + + + A single central repository + + For smaller projects migrating from a centralised revision + control tool, perhaps the easiest way to get started is to + have changes flow through a single shared central repository. + This is also the most common building block for + more ambitious workflow schemes. + + Contributors start by cloning a copy of this repository. + They can pull changes from it whenever they need to, and some + (perhaps all) developers have permission to push a change back + when they're ready for other people to see it. + + Under this model, it can still often make sense for people + to pull changes directly from each other, without going + through the central repository. Consider a case in which I + have a tentative bug fix, but I am worried that if I were to + publish it to the central repository, it might subsequently + break everyone else's trees as they pull it. To reduce the + potential for damage, I can ask you to clone my repository + into a temporary repository of your own and test it. This + lets us put off publishing the potentially unsafe change until + it has had a little testing.
+ + If a team is hosting its own repository in this + kind of scenario, people will usually use the + ssh protocol to securely push changes to + the central repository, as documented in . It's also usual to publish a + read-only copy of the repository over HTTP, as in + . Publishing over HTTP + satisfies the needs of people who don't have push access, and + those who want to use web browsers to browse the repository's + history. + + + + A hosted central repository + + A wonderful thing about public hosting services like + Bitbucket is that + not only do they handle the fiddly server configuration + details, such as user accounts, authentication, and secure + wire protocols, they provide additional infrastructure to make + this model work well. + + For instance, a well-engineered hosting service will let + people clone their own copies of a repository with a single + click. This lets people work in separate spaces and share + their changes when they're ready. + + In addition, a good hosting service will let people + communicate with each other, for instance to say there + are changes ready for you to review in this + tree. + + + + Working with multiple branches + + Projects of any significant size naturally tend to make + progress on several fronts simultaneously. In the case of + software, it's common for a project to go through periodic + official releases. A release might then go into + maintenance mode for a while after its first + publication; maintenance releases tend to contain only bug + fixes, not new features. In parallel with these maintenance + releases, one or more future releases may be under + development. People normally use the word + branch to refer to one of these many slightly + different directions in which development is + proceeding. + + Mercurial is particularly well suited to managing a number + of simultaneous, but not identical, branches. 
Each + development direction can live in its own + central repository, and you can merge changes from one to + another as the need arises. Because repositories are + independent of each other, unstable changes in a development + branch will never affect a stable branch unless someone + explicitly merges those changes into the stable branch. + + Here's an example of how this can work in practice. Let's + say you have one main branch on a central + server. + + &interaction.branching.init; + + People clone it, make changes locally, test them, and push + them back. + + Once the main branch reaches a release milestone, you can + use the hg tag command to + give a permanent name to the milestone revision. + + &interaction.branching.tag; + + Let's say some ongoing + development occurs on the main branch. + + &interaction.branching.main; + + Using the tag that was recorded at the milestone, people + who clone that repository at any time in the future can use + hg update to get a copy of + the working directory exactly as it was when that tagged + revision was committed. + + &interaction.branching.update; + + In addition, immediately after the main branch is tagged, + we can then clone the main branch on the server to a new + stable branch, also on the server. + + &interaction.branching.clone; + + If we need to make a change to the stable + branch, we can then clone that + repository, make our changes, commit, and push our changes + back there. + + &interaction.branching.stable; + + Because Mercurial repositories are independent, and + Mercurial doesn't move changes around automatically, the + stable and main branches are isolated + from each other. The changes that we made on the main branch + don't leak to the stable branch, and vice + versa. + + We'll often want all of our bugfixes on the stable + branch to show up on the main branch, too. 
Rather than + rewrite a bugfix on the main branch, we can simply pull and + merge changes from the stable to the main branch, and + Mercurial will bring those bugfixes in for us. + + &interaction.branching.merge; + + The main branch will still contain changes that + are not on the stable branch, but it will also contain all of + the bugfixes from the stable branch. The stable branch + remains unaffected by these changes, since changes are only + flowing from the stable to the main branch, and not the other + way. + + + + Feature branches + + For larger projects, an effective way to manage change is + to break up a team into smaller groups. Each group has a + shared branch of its own, cloned from a single + master branch used by the entire project. + People working on an individual branch are typically quite + isolated from developments on other branches. + +
+ Feature branches + + + XXX add text + +
+ + When a particular feature is deemed to be in suitable + shape, someone on that feature team pulls and merges from the + master branch into the feature branch, then pushes back up to + the master branch. +
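The hand-off described above might look roughly like the following when run from inside a clone of the feature branch. This is only a sketch: the function name, the commit message, and the MASTER_URL default are hypothetical, and the merge step will stop the sequence if the pull brought in nothing to merge.

```shell
#!/bin/sh
# Sketch of the feature-to-master hand-off; run from a clone of the feature
# branch. MASTER_URL is a hypothetical location for the master branch.
# HG may be overridden, e.g. HG=echo for a dry run that only prints commands.
HG="${HG:-hg}"
MASTER_URL="${MASTER_URL:-ssh://server/hg/master}"

merge_feature_into_master() {
    # Bring the latest master changes into the feature repository.
    "$HG" pull "$MASTER_URL" || return 1
    # Merge them with the feature work, then record the merge.
    "$HG" merge || return 1
    "$HG" commit -m "Merge with master" || return 1
    # Publish the combined result back to the master branch.
    "$HG" push "$MASTER_URL"
}

# Example (not run here): merge_feature_into_master
```

Merging master into the feature branch first means any conflicts are resolved by the people who know the feature, before anything reaches the shared master branch.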
+ + + The release train + + Some projects are organized on a train + basis: a release is scheduled to happen every few months, and + whatever features are ready when the train is + ready to leave are allowed in. + + This model resembles working with feature branches. The + difference is that when a feature branch misses a train, + someone on the feature team pulls and merges the changes that + went out on that train release into the feature branch, and + the team continues its work on top of that release so that + their feature can make the next release. + + + + The Linux kernel model + + The development of the Linux kernel has a shallow + hierarchical structure, surrounded by a cloud of apparent + chaos. Because most Linux developers use + git, a distributed revision control tool + with capabilities similar to Mercurial, it's useful to + describe the way work flows in that environment; if you like + the ideas, the approach translates well across tools. + + At the center of the community sits Linus Torvalds, the + creator of Linux. He publishes a single source repository + that is considered the authoritative current + tree by the entire developer community. Anyone can clone + Linus's tree, but he is very choosy about whose trees he pulls + from. + + Linus has a number of trusted lieutenants. + As a general rule, he pulls whatever changes they publish, in + most cases without even reviewing those changes. Some of + those lieutenants are generally agreed to be + maintainers, responsible for specific + subsystems within the kernel. If a random kernel hacker wants + to make a change to a subsystem that they want to end up in + Linus's tree, they must find out who the subsystem's + maintainer is, and ask that maintainer to take their change. + If the maintainer reviews their changes and agrees to take + them, they'll pass them along to Linus in due course. 
+ + Individual lieutenants have their own approaches to + reviewing, accepting, and publishing changes; and for deciding + when to feed them to Linus. In addition, there are several + well known branches that people use for different purposes. + For example, a few people maintain stable + repositories of older versions of the kernel, to which they + apply critical fixes as needed. Some maintainers publish + multiple trees: one for experimental changes; one for changes + that they are about to feed upstream; and so on. Others just + publish a single tree. + + This model has two notable features. The first is that + it's pull only. You have to ask, convince, or + beg another developer to take a change from you, because there + are almost no trees to which more than one person can push, + and there's no way to push changes into a tree that someone + else controls. + + The second is that it's based on reputation and acclaim. + If you're an unknown, Linus will probably ignore changes from + you without even responding. But a subsystem maintainer will + probably review them, and will likely take them if they pass + their criteria for suitability. The more good + changes you contribute to a maintainer, the more likely they + are to trust your judgment and accept your changes. If you're + well-known and maintain a long-lived branch for something + Linus hasn't yet accepted, people with similar interests may + pull your changes regularly to keep up with your work. + + Reputation and acclaim don't necessarily cross subsystem + or people boundaries. If you're a respected + but specialised storage hacker, and you try to fix a + networking bug, that change will receive a level of scrutiny + from a network maintainer comparable to a change from a + complete stranger. + + To people who come from more orderly project backgrounds, + the comparatively chaotic Linux kernel development process + often seems completely insane. 
It's subject to the whims of + individuals; people make sweeping changes whenever they deem + it appropriate; and the pace of development is astounding. + And yet Linux is a highly successful, well-regarded piece of + software. + + + + Pull-only versus shared-push collaboration + + A perpetual source of heat in the open source community is + whether a development model in which people only ever pull + changes from others is better than one in which + multiple people can push changes to a shared + repository. + + Typically, the backers of the shared-push model use tools + that actively enforce this approach. If you're using a + centralised revision control tool such as Subversion, there's + no way to make a choice over which model you'll use: the tool + gives you shared-push, and if you want to do anything else, + you'll have to roll your own approach on top (such as applying + a patch by hand). + + A good distributed revision control tool will + support both models. You and your collaborators can then + structure how you work together based on your own needs and + preferences, not on what contortions your tools force you + into. + + + Where collaboration meets branch management + + Once you and your team set up some shared + repositories and start propagating changes back and forth + between local and shared repos, you begin to face a related, + but slightly different challenge: that of managing the + multiple directions in which your team may be moving at once. + Even though this subject is intimately related to how your + team collaborates, it's dense enough to merit treatment of its + own, in . + +
+ + + The technical side of sharing + + The remainder of this chapter is devoted to the question of + sharing changes with your collaborators. + + + + Informal sharing with <command role="hg-cmd">hg + serve</command> + + Mercurial's hg serve + command is wonderfully suited to small, tight-knit, and + fast-paced group environments. It also provides a great way to + get a feel for using Mercurial commands over a network. + + Run hg serve inside a + repository, and in under a second it will bring up a specialised + HTTP server; this will accept connections from any client, and + serve up data for that repository until you terminate it. + Anyone who knows the URL of the server you just started, and can + talk to your computer over the network, can then use a web + browser or Mercurial to read data from that repository. A URL + for a hg serve instance running + on a laptop is likely to look something like + http://my-laptop.local:8000/. + + The hg serve command is + not a general-purpose web server. It can do + only two things: + + Allow people to browse the history of the + repository it's serving, from their normal web + browsers. + + Speak Mercurial's wire protocol, so that people + can hg clone or hg pull changes from that + repository. + + In particular, hg serve + won't allow remote users to modify your + repository. It's intended for read-only use. + + If you're getting started with Mercurial, there's nothing to + prevent you from using hg serve + to serve up a repository on your own computer, then use commands + like hg clone, hg incoming, and so on to talk to that + server as if the repository was hosted remotely. This can help + you to quickly get acquainted with using commands on + network-hosted repositories. 
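To try this on a single machine, something like the sketch below should work, with the server in one terminal and the client commands in another. The function names and the PORT variable are assumptions for illustration; -p selects the listening port and -v asks hg serve to report the address it is listening on.

```shell
#!/bin/sh
# Sketch: practise network commands against a repository on your own machine.
# HG may be overridden, e.g. HG=echo for a dry run that only prints commands.
HG="${HG:-hg}"
PORT="${PORT:-8000}"

# Terminal 1: run inside the repository; serves it until interrupted.
serve_here() {
    "$HG" serve -p "$PORT" -v
}

# Terminal 2: clone from the local server as if it were a remote host.
# $1 is the directory to clone into.
clone_from_local() {
    "$HG" clone "http://localhost:$PORT/" "$1"
}

# Terminal 2: list changes the server has that the current clone lacks.
incoming_from_local() {
    "$HG" incoming "http://localhost:$PORT/"
}
```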
+ + + A few things to keep in mind + + Because it provides unauthenticated read access to all + clients, you should only use hg + serve in an environment where you either don't + care, or have complete control over, who can access your + network and pull data from your repository. + + The hg serve command + knows nothing about any firewall software you might have + installed on your system or network. It cannot detect or + control your firewall software. If other people are unable to + talk to a running hg serve + instance, the second thing you should do + (after you make sure that they're using + the correct URL) is check your firewall configuration. + + By default, hg serve + listens for incoming connections on port 8000. If another + process is already listening on the port you want to use, you + can specify a different port to listen on using the option. + + Normally, when hg serve + starts, it prints no output, which can be a bit unnerving. If + you'd like to confirm that it is indeed running correctly, and + find out what URL you should send to your collaborators, start + it with the + option. + + + + + Using the Secure Shell (ssh) protocol + + You can pull and push changes securely over a network + connection using the Secure Shell (ssh) + protocol. To use this successfully, you may have to do a little + bit of configuration on the client or server sides. + + If you're not familiar with ssh, it's the name of + both a command and a network protocol that let you securely + communicate with another computer. To use it with Mercurial, + you'll be setting up one or more user accounts on a server so + that remote users can log in and execute commands. + + (If you are familiar with ssh, you'll + probably find some of the material that follows to be elementary + in nature.) + + + How to read and write ssh URLs + + An ssh URL tends to look like this: + ssh://bos@hg.serpentine.com:22/hg/hgbook + + The ssh:// + part tells Mercurial to use the ssh protocol. 
+ + The bos@ + component indicates what username to log into the server + as. You can leave this out if the remote username is the + same as your local username. + + The + hg.serpentine.com gives + the hostname of the server to log into. + + The :22 identifies the port + number to connect to the server on. The default port is + 22, so you only need to specify a colon and port number if + you're not using port 22. + + The remainder of the URL is the local path to + the repository on the server. + + + There's plenty of scope for confusion with the path + component of ssh URLs, as there is no standard way for tools + to interpret it. Some programs behave differently than others + when dealing with these paths. This isn't an ideal situation, + but it's unlikely to change. Please read the following + paragraphs carefully. + + Mercurial treats the path to a repository on the server as + relative to the remote user's home directory. For example, if + user foo on the server has a home directory + of /home/foo, then an + ssh URL that contains a path component of bar really + refers to the directory /home/foo/bar. + + If you want to specify a path relative to another user's + home directory, you can use a path that starts with a tilde + character followed by the user's name (let's call them + otheruser), like this. + ssh://server/~otheruser/hg/repo + + And if you really want to specify an + absolute path on the server, begin the + path component with two slashes, as in this example. + ssh://server//absolute/path + + + + Finding an ssh client for your system + + Almost every Unix-like system comes with OpenSSH + preinstalled. If you're using such a system, run + which ssh to find out if the + ssh command is installed (it's usually in + /usr/bin). In the + unlikely event that it isn't present, take a look at your + system documentation to figure out how to install it. 
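The two points above, how an ssh URL is put together and how to check for a client, can be sketched in a few lines of shell. Both function names are hypothetical helpers, not Mercurial commands; the URL builder simply follows the rules described earlier, including the double slash for an absolute path.

```shell
#!/bin/sh
# Hypothetical helpers illustrating ssh URLs and client detection.

# Build an ssh URL from user, host, port, and path.
# An empty user or port is left out, matching the defaults described above.
ssh_repo_url() {
    url_user=$1
    url_host=$2
    url_port=$3
    url_path=$4
    url="ssh://"
    if [ -n "$url_user" ]; then url="${url}${url_user}@"; fi
    url="${url}${url_host}"
    if [ -n "$url_port" ]; then url="${url}:${url_port}"; fi
    # A path with a leading slash yields "//...", i.e. an absolute path.
    printf '%s/%s\n' "$url" "$url_path"
}

# Print the location of the first ssh client found on PATH.
# plink is the command-line client from the PuTTY suite.
find_ssh_client() {
    for candidate in ssh plink; do
        if command -v "$candidate" >/dev/null 2>&1; then
            command -v "$candidate"
            return 0
        fi
    done
    echo "no ssh client found on PATH" >&2
    return 1
}

# Examples (not run here):
#   ssh_repo_url bos hg.serpentine.com 22 hg/hgbook
#   ssh_repo_url "" server "" /absolute/path
```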
+ + On Windows, the TortoiseHg package is bundled + with a version of Simon Tatham's excellent + plink command, and you should not need to + do any further configuration. + + + + Generating a key pair + + To avoid the need to repetitively type a + password every time you need to use your ssh client, I + recommend generating a key pair. + + + Key pairs are not mandatory + + Mercurial knows nothing about ssh authentication or key + pairs. You can, if you like, safely ignore this section and + the one that follows until you grow tired of repeatedly + typing ssh passwords. + + + + + On a Unix-like system, the + ssh-keygen command will do the + trick. + On Windows, if you're using TortoiseHg, you may need + to download a command named puttygen + from the + PuTTY web site to generate a key pair. See + the + puttygen documentation for + details of how to use the command. + + + + When you generate a key pair, it's usually + highly advisable to protect it with a + passphrase. (The only time that you might not want to do this + is when you're using the ssh protocol for automated tasks on a + secure network.) + + Simply generating a key pair isn't enough, however. + You'll need to add the public key to the set of authorised + keys for whatever user you're logging in remotely as. For + servers using OpenSSH (the vast majority), this will mean + adding the public key to a list in a file called authorized_keys in their .ssh + directory. + + On a Unix-like system, your public key will have a + .pub extension. If you're using + puttygen on Windows, you can save the + public key to a file of your choosing, or paste it from the + window it's displayed in straight into the authorized_keys file. + + + Using an authentication agent + + An authentication agent is a daemon that stores + passphrases in memory (so it will forget passphrases if you + log out and log back in again). An ssh client will notice if + it's running, and query it for a passphrase.
If there's no + authentication agent running, or the agent doesn't store the + necessary passphrase, you'll have to type your passphrase + every time Mercurial tries to communicate with a server on + your behalf (e.g. whenever you pull or push changes). + + The downside of storing passphrases in an agent is that + it's possible for a well-prepared attacker to recover the + plain text of your passphrases, in some cases even if your + system has been power-cycled. You should make your own + judgment as to whether this is an acceptable risk. It + certainly saves a lot of repeated typing. + + + + On Unix-like systems, the agent is called + ssh-agent, and it's often run + automatically for you when you log in. You'll need to use + the ssh-add command to add passphrases + to the agent's store. + + + On Windows, if you're using TortoiseHg, the + pageant command acts as the agent. As + with puttygen, you'll need to download + pageant from the PuTTY web + site and read its + documentation. The pageant + command adds an icon to your system tray that will let you + manage stored passphrases. + + + + + + Configuring the server side properly + + Because ssh can be fiddly to set up if you're new to it, + a variety of things can go wrong. Add Mercurial + on top, and there's plenty more scope for head-scratching. + Most of these potential problems occur on the server side, not + the client side. The good news is that once you've gotten a + configuration working, it will usually continue to work + indefinitely. + + Before you try using Mercurial to talk to an ssh server, + it's best to make sure that you can use the normal + ssh or putty command to + talk to the server first. If you run into problems with using + these commands directly, Mercurial surely won't work. Worse, + it will obscure the underlying problem. 
Any time you want to + debug ssh-related Mercurial problems, you should drop back to + making sure that plain ssh client commands work first, + before you worry about whether there's a + problem with Mercurial. + + The first thing to be sure of on the server side is that + you can actually log in from another machine at all. If you + can't use ssh or putty + to log in, the error message you get may give you a few hints + as to what's wrong. The most common problems are as + follows. + + If you get a connection refused + error, either there isn't an SSH daemon running on the + server at all, or it's inaccessible due to firewall + configuration. + + If you get a no route to host + error, you either have an incorrect address for the server + or a seriously locked down firewall that won't admit its + existence at all. + + If you get a permission denied + error, you may have mistyped the username on the server, + or you could have mistyped your key's passphrase or the + remote user's password. + + In summary, if you're having trouble talking to the + server's ssh daemon, first make sure that one is running at + all. On many systems it will be installed, but disabled, by + default. Once you're done with this step, you should then + check that the server's firewall is configured to allow + incoming connections on the port the ssh daemon is listening + on (usually 22). Don't worry about more exotic possibilities + for misconfiguration until you've checked these two + first. + + If you're using an authentication agent on the client side + to store passphrases for your keys, you ought to be able to + log into the server without being prompted for a passphrase or + a password. If you're prompted for a passphrase, there are a + few possible culprits. + + You might have forgotten to use + ssh-add or pageant + to store the passphrase. + + You might have stored the passphrase for the + wrong key. 
+ + If you're being prompted for the remote user's password, + there are another few possible problems to check. + + Either the user's home directory or their + .ssh + directory might have excessively liberal permissions. As + a result, the ssh daemon will not trust or read their + authorized_keys file. + For example, a group-writable home or .ssh + directory will often cause this symptom. + + The user's authorized_keys file may have + a problem. If anyone other than the user owns or can write + to that file, the ssh daemon will not trust or read + it. + + + In the ideal world, you should be able to run the + following command successfully, and it should print exactly + one line of output, the current date and time. + ssh myserver date + + If, on your server, you have login scripts that print + banners or other junk even when running non-interactive + commands like this, you should fix them before you continue, + so that they only print output if they're run interactively. + Otherwise these banners will at least clutter up Mercurial's + output. Worse, they could potentially cause problems with + running Mercurial commands remotely. Mercurial tries to + detect and ignore banners in non-interactive + ssh sessions, but it is not foolproof. (If + you're editing your login scripts on your server, the usual + way to see if a login script is running in an interactive + shell is to check the return code from the command + tty -s.) + + Once you've verified that plain old ssh is working with + your server, the next step is to ensure that Mercurial runs on + the server. The following command should run + successfully: + + ssh myserver hg version + + If you see an error message instead of normal hg version output, this is usually + because you haven't installed Mercurial to /usr/bin. Don't worry if this + is the case; you don't need to do that. But you should check + for a few possible problems. + + Is Mercurial really installed on the server at + all? 
I know this sounds trivial, but it's worth + checking! + + Maybe your shell's search path (usually set + via the PATH environment variable) is + simply misconfigured. + + Perhaps your PATH environment + variable is only being set to point to the location of the + hg executable if the login session is + interactive. This can happen if you're setting the path + in the wrong shell login script. See your shell's + documentation for details. + + The PYTHONPATH environment + variable may need to contain the path to the Mercurial + Python modules. It might not be set at all; it could be + incorrect; or it may be set only if the login is + interactive. + + + If you can run hg version + over an ssh connection, well done! You've got the server and + client sorted out. You should now be able to use Mercurial to + access repositories hosted by that username on that server. + If you run into problems with Mercurial and ssh at this point, + try using the + option to get a clearer picture of what's going on. + + + Using compression with ssh + + Mercurial does not compress data when it uses the ssh + protocol, because the ssh protocol can transparently compress + data. However, the default behavior of ssh clients is + not to request compression. + + Over any network other than a fast LAN (even a wireless + network), using compression is likely to significantly speed + up Mercurial's network operations. For example, over a WAN, + someone measured compression as reducing the amount of time + required to clone a particularly large repository from 51 + minutes to 17 minutes. + + Both ssh and plink + accept a option which + turns on compression. You can easily edit your ~/.hgrc to enable compression for + all of Mercurial's uses of the ssh protocol. Here is how to + do so for regular ssh on Unix-like systems, + for example. + [ui] +ssh = ssh -C + + If you use ssh on a + Unix-like system, you can configure it to always use + compression when talking to your server. 
To do this, edit + your .ssh/config file + (which may not yet exist), as follows. + + Host hg + Compression yes + HostName hg.example.com + + This defines a hostname alias, + hg. When you use that hostname on the + ssh command line or in a Mercurial + ssh-protocol URL, it will cause + ssh to connect to + hg.example.com and use compression. This + gives you both a shorter name to type and compression, each of + which is a good thing in its own right. + + + + + Serving over HTTP using CGI + + The simplest way to host one or more repositories in a + permanent way is to use a web server and Mercurial's CGI + support. + + Depending on how ambitious you are, configuring Mercurial's + CGI interface can take anything from a few moments to several + hours. + + We'll begin with the simplest of examples, and work our way + towards a more complex configuration. Even for the most basic + case, you're almost certainly going to need to read and modify + your web server's configuration. + + + High pain tolerance required + + Configuring a web server is a complex, fiddly, + and highly system-dependent activity. I can't possibly give + you instructions that will cover anything like all of the + cases you will encounter. Please use your discretion and + judgment in following the sections below. Be prepared to make + plenty of mistakes, and to spend a lot of time reading your + server's error logs. + + If you don't have a strong stomach for tweaking + configurations over and over, or a compelling need to host + your own services, you might want to try one of the public + hosting services that I mentioned earlier. + + + + Web server configuration checklist + + Before you continue, do take a few moments to check a few + aspects of your system's setup. + + + Do you have a web server installed + at all? Mac OS X and some Linux distributions ship with + Apache, but many other systems may not have a web server + installed. + + If you have a web server installed, is it + actually running? 
On most systems, even if one is + present, it will be disabled by default. + + Is your server configured to allow you to run + CGI programs in the directory where you plan to do so? + Most servers default to explicitly disabling the ability + to run CGI programs. + + + If you don't have a web server installed, and don't have + substantial experience configuring Apache, you should consider + using the lighttpd web server instead of + Apache. Apache has a well-deserved reputation for baroque and + confusing configuration. While lighttpd is + less capable in some ways than Apache, most of these + capabilities are not relevant to serving Mercurial + repositories. And lighttpd is undeniably + much easier to get started with than + Apache. + + + + Basic CGI configuration + + On Unix-like systems, it's common for users to have a + subdirectory named something like public_html in their home + directory, from which they can serve up web pages. A file + named foo in this directory will be + accessible at a URL of the form + http://www.example.com/username/foo. + + To get started, find the hgweb.cgi script that should be + present in your Mercurial installation. If you can't quickly + find a local copy on your system, simply download one from the + master Mercurial repository at http://www.selenic.com/repo/hg/raw-file/tip/hgweb.cgi. + + You'll need to copy this script into your public_html directory, and + ensure that it's executable. + cp .../hgweb.cgi ~/public_html +chmod 755 ~/public_html/hgweb.cgi + The 755 argument to + chmod is a little more general than just + making the script executable: it ensures that the script is + executable by anyone, and that group and + other write permissions are + not set. If you were to leave those + write permissions enabled, Apache's suexec + subsystem would likely refuse to execute the script. In fact, + suexec also insists that the + directory in which the script resides + must not be writable by others. 
+ chmod 755 ~/public_html + + + What could <emphasis>possibly</emphasis> go + wrong? + + Once you've copied the CGI script into place, + go into a web browser, and try to open the URL + http://myhostname/~myuser/hgweb.cgi, + but brace yourself for instant failure. + There's a high probability that trying to visit this URL + will fail, and there are many possible reasons for this. In + fact, you're likely to stumble over almost every one of the + possible errors below, so please read carefully. The + following are all of the problems I ran into on a system + running Fedora 7, with a fresh installation of Apache, and a + user account that I created specially to perform this + exercise. + + Your web server may have per-user directories disabled. + If you're using Apache, search your config file for a + UserDir directive. If there's none + present, per-user directories will be disabled. If one + exists, but its value is disabled, then + per-user directories will be disabled. Otherwise, the + string after UserDir gives the name of + the subdirectory that Apache will look in under your home + directory, for example public_html. + + Your file access permissions may be too restrictive. + The web server must be able to traverse your home directory + and directories under your public_html directory, and + read files under the latter too. Here's a quick recipe to + help you to make your permissions more appropriate. + chmod 755 ~ +find ~/public_html -type d -print0 | xargs -0r chmod 755 +find ~/public_html -type f -print0 | xargs -0r chmod 644 + + The other possibility with permissions is that you might + get a completely empty window when you try to load the + script. In this case, it's likely that your access + permissions are too permissive. Apache's + suexec subsystem won't execute a script + that's group- or world-writable, for example. + + Your web server may be configured to disallow execution + of CGI programs in your per-user web directory. 
Here's + Apache's default per-user configuration from my Fedora + system. + + &ch06-apache-config.lst; + + If you find a similar-looking + Directory group in your Apache + configuration, the directive to look at inside it is + Options. Add ExecCGI + to the end of this list if it's missing, and restart the web + server. + + If you find that Apache serves you the text of the CGI + script instead of executing it, you may need to either + uncomment (if already present) or add a directive like + this. + AddHandler cgi-script .cgi + + The next possibility is that you might be served with a + colourful Python backtrace claiming that it can't import a + mercurial-related module. This is + actually progress! The server is now capable of executing + your CGI script. This error is only likely to occur if + you're running a private installation of Mercurial, instead + of a system-wide version. Remember that the web server runs + the CGI program without any of the environment variables + that you take for granted in an interactive session. If + this error happens to you, edit your copy of hgweb.cgi and follow the + directions inside it to correctly set your + PYTHONPATH environment variable. + + Finally, you are certain to be + served with another colourful Python backtrace: this one + will complain that it can't find /path/to/repository. Edit + your hgweb.cgi script + and replace the /path/to/repository string + with the complete path to the repository you want to serve + up. + + At this point, when you try to reload the page, you + should be presented with a nice HTML view of your + repository's history. Whew! + + + + Configuring lighttpd + + To be exhaustive in my experiments, I tried configuring + the increasingly popular lighttpd web + server to serve the same repository as I described with + Apache above. I had already overcome all of the problems I + outlined with Apache, many of which are not server-specific. 
+ As a result, I was fairly sure that my file and directory + permissions were good, and that my hgweb.cgi script was properly + edited. + + Once I had Apache running, getting + lighttpd to serve the repository was a + snap (in other words, even if you're trying to use + lighttpd, you should read the Apache + section). I first had to edit the + mod_access section of its config file to + enable mod_cgi and + mod_userdir, both of which were disabled + by default on my system. I then added a few lines to the + end of the config file, to configure these modules. + userdir.path = "public_html" +cgi.assign = (".cgi" => "" ) + With this done, lighttpd ran + immediately for me. If I had configured + lighttpd before Apache, I'd almost + certainly have run into many of the same system-level + configuration problems as I did with Apache. However, I + found lighttpd to be noticeably easier to + configure than Apache, even though I've used Apache for over + a decade, and this was my first exposure to + lighttpd. + + + + + Sharing multiple repositories with one CGI script + + The hgweb.cgi script + only lets you publish a single repository, which is an + annoying restriction. If you want to publish more than one + without wracking yourself with multiple copies of the same + script, each with different names, a better choice is to use + the hgwebdir.cgi + script. + + The procedure to configure hgwebdir.cgi is only a little more + involved than for hgweb.cgi. First, you must obtain + a copy of the script. If you don't have one handy, you can + download a copy from the master Mercurial repository at http://www.selenic.com/repo/hg/raw-file/tip/hgwebdir.cgi. + + You'll need to copy this script into your public_html directory, and + ensure that it's executable. + + cp .../hgwebdir.cgi ~/public_html +chmod 755 ~/public_html ~/public_html/hgwebdir.cgi + + With basic configuration out of the way, try to + visit http://myhostname/~myuser/hgwebdir.cgi + in your browser. 
It should + display an empty list of repositories. If you get a blank + window or error message, try walking through the list of + potential problems in . + + The hgwebdir.cgi + script relies on an external configuration file. By default, + it searches for a file named hgweb.config in the same directory + as itself. You'll need to create this file, and make it + world-readable. The format of the file is similar to a + Windows ini file, as understood by Python's + ConfigParser + web:configparser module. + + The easiest way to configure hgwebdir.cgi is with a section + named collections. This will automatically + publish every repository under the + directories you name. The section should look like + this: + [collections] +/my/root = /my/root + Mercurial interprets this by looking at the directory name + on the right hand side of the + = sign; finding repositories + in that directory hierarchy; and using the text on the + left to strip off matching text from the + names it will actually list in the web interface. The + remaining component of a path after this stripping has + occurred is called a virtual path. + + Given the example above, if we have a + repository whose local path is /my/root/this/repo, the CGI + script will strip the leading /my/root from the name, and + publish the repository with a virtual path of this/repo. If the base URL for + our CGI script is + http://myhostname/~myuser/hgwebdir.cgi, the + complete URL for that repository will be + http://myhostname/~myuser/hgwebdir.cgi/this/repo. + + If we replace /my/root on the left hand side + of this example with /my, then hgwebdir.cgi will only strip off + /my from the repository + name, and will give us a virtual path of root/this/repo instead of + this/repo. + + The hgwebdir.cgi + script will recursively search each directory listed in the + collections section of its configuration + file, but it will not recurse into the + repositories it finds. 
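To make the stripping rule concrete, here is a minimal Python sketch of how a virtual path could be derived from one collections entry. The virtual_path function is a hypothetical helper written for this illustration; it is not Mercurial's own code, and it assumes plain absolute paths with no symbolic links.

```python
def virtual_path(prefix, repo_path):
    """Strip `prefix` (the left hand side of a collections entry) from
    `repo_path`, leaving the virtual path described in the text.

    Hypothetical helper for illustration only, not part of Mercurial.
    """
    prefix = prefix.rstrip("/")
    # Only paths strictly below the prefix can be stripped.
    if not repo_path.startswith(prefix + "/"):
        raise ValueError("%s is not under %s" % (repo_path, prefix))
    return repo_path[len(prefix):].lstrip("/")


print(virtual_path("/my/root", "/my/root/this/repo"))  # this/repo
print(virtual_path("/my", "/my/root/this/repo"))       # root/this/repo
```

The two calls mirror the two cases in the text: stripping /my/root leaves this/repo, while stripping only /my leaves root/this/repo.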
+ + The collections mechanism makes it easy + to publish many repositories in a fire and + forget manner. You only need to set up the CGI + script and configuration file one time. Afterwards, you can + publish or unpublish a repository at any time by simply moving + it into, or out of, the directory hierarchy in which you've + configured hgwebdir.cgi to + look. + + + Explicitly specifying which repositories to + publish + + In addition to the collections + mechanism, the hgwebdir.cgi script allows you + to publish a specific list of repositories. To do so, + create a paths section, with contents of + the following form. + [paths] +repo1 = /my/path/to/some/repo +repo2 = /some/path/to/another + In this case, the virtual path (the component that will + appear in a URL) is on the left hand side of each + definition, while the path to the repository is on the + right. Notice that there does not need to be any + relationship between the virtual path you choose and the + location of a repository in your filesystem. + + If you wish, you can use both the + collections and paths + mechanisms simultaneously in a single configuration + file. + + + Beware duplicate virtual paths + + If several repositories have the same + virtual path, hgwebdir.cgi will not report + an error. Instead, it will behave unpredictably. + + + + + + Downloading source archives + + Mercurial's web interface lets users download an archive + of any revision. This archive will contain a snapshot of the + working directory as of that revision, but it will not contain + a copy of the repository data. + + By default, this feature is not enabled. To enable it, + you'll need to add an allow_archive item to the + web section of your ~/.hgrc; see below for details. + + + Web configuration options + + Mercurial's web interfaces (the hg + serve command, and the hgweb.cgi and hgwebdir.cgi scripts) have a + number of configuration options that you can set. These + belong in a section named web. 
+ + allow_archive: Determines + which (if any) archive download mechanisms Mercurial + supports. If you enable this feature, users of the web + interface will be able to download an archive of whatever + revision of a repository they are viewing. To enable the + archive feature, this item must take the form of a + sequence of words drawn from the list below. + + bz2: A + tar archive, compressed using + bzip2 compression. This has the + best compression ratio, but uses the most CPU time on + the server. + + gz: A + tar archive, compressed using + gzip compression. + + zip: A + zip archive, compressed using DEFLATE + compression. This format has the worst compression + ratio, but is widely used in the Windows world. + + + If you provide an empty list, or don't have an + allow_archive entry at + all, this feature will be disabled. Here is an example of + how to enable all three supported formats. + [web] +allow_archive = bz2 gz zip + + allowpull: + Boolean. Determines whether the web interface allows + remote users to hg pull + and hg clone this + repository over HTTP. If set to no or + false, only the + human-oriented portion of the web interface + is available. + + contact: + String. A free-form (but preferably brief) string + identifying the person or group in charge of the + repository. This often contains the name and email + address of a person or mailing list. It often makes sense + to place this entry in a repository's own .hg/hgrc file, but it can make + sense to use in a global ~/.hgrc if every repository + has a single maintainer. + + maxchanges: + Integer. The default maximum number of changesets to + display in a single page of output. + + maxfiles: + Integer. The default maximum number of modified files to + display in a single page of output. + + stripes: + Integer. If the web interface displays alternating + stripes to make it easier to visually align + rows when you are looking at a table, this number controls + the number of rows in each stripe.
+ + style: Controls the template + Mercurial uses to display the web interface. Mercurial + ships with several web templates. + + + coal is monochromatic. + + + gitweb emulates the visual + style of git's web interface. + + + monoblue uses solid blues and + greys. + + + paper is the default. + + + spartan was the default for a + long time. + + + You can + also specify a custom template of your own; see + for details. Here, you can + see how to enable the gitweb + style. + [web] +style = gitweb + + templates: + Path. The directory in which to search for template + files. By default, Mercurial searches in the directory in + which it was installed. + + If you are using hgwebdir.cgi, you can place a few + configuration items in a web + section of the hgweb.config file instead of a + ~/.hgrc file, for + convenience. These items are motd and style. + + + Options specific to an individual repository + + A few web configuration + items ought to be placed in a repository's local .hg/hgrc, rather than a user's + or global ~/.hgrc. + + description: String. A + free-form (but preferably brief) string that describes + the contents or purpose of the repository. + + name: + String. The name to use for the repository in the web + interface. This overrides the default name, which is + the last component of the repository's path. + + + + + Options specific to the <command role="hg-cmd">hg + serve</command> command + + Some of the items in the web section of a ~/.hgrc file are only for use + with the hg serve + command. + + accesslog: + Path. The name of a file into which to write an access + log. By default, the hg + serve command writes this information to + standard output, not to a file. Log entries are written + in the standard combined file format used + by almost all web servers. + + address: + String. The local address on which the server should + listen for incoming connections. By default, the server + listens on all addresses. + + errorlog: + Path. 
The name of a file into which to write an error + log. By default, the hg + serve command writes this information to + standard error, not to a file. + + ipv6: + Boolean. Whether to use the IPv6 protocol. By default, + IPv6 is not used. + + port: + Integer. The TCP port number on which the server should + listen. The default port number used is 8000. + + + + + Choosing the right <filename + role="special">~/.hgrc</filename> file to add <literal + role="rc-web">web</literal> items to + + It is important to remember that a web server like + Apache or lighttpd will run under a user + ID that is different to yours. CGI scripts run by your + server, such as hgweb.cgi, will usually also run + under that user ID. + + If you add web items to + your own personal ~/.hgrc file, CGI scripts won't read that + ~/.hgrc file. Those + settings will thus only affect the behavior of the hg serve command when you run it. + To cause CGI scripts to see your settings, either create a + ~/.hgrc file in the + home directory of the user ID that runs your web server, or + add those settings to a system-wide hgrc file. + + + + + + System-wide configuration + + On Unix-like systems shared by multiple users (such as a + server to which people publish changes), it often makes sense to + set up some global default behaviors, such as what theme to use + in web interfaces. + + If a file named /etc/mercurial/hgrc + exists, Mercurial will read it at startup time and apply any + configuration settings it finds in that file. It will also look + for files ending in a .rc extension in a + directory named /etc/mercurial/hgrc.d, and + apply any configuration settings it finds in each of those + files. + + + Making Mercurial more trusting + + One situation in which a global hgrc + can be useful is if users are pulling changes owned by other + users. By default, Mercurial will not trust most of the + configuration items in a .hg/hgrc file + inside a repository that is owned by a different user. 
If we + clone or pull changes from such a repository, Mercurial will + print a warning stating that it does not trust their + .hg/hgrc. + + If everyone in a particular Unix group is on the same team + and should trust each other's + configuration settings, or we want to trust particular users, + we can override Mercurial's skeptical defaults by creating a + system-wide hgrc file such as the + following: + + # Save this as e.g. /etc/mercurial/hgrc.d/trust.rc +[trusted] +# Trust all entries in any hgrc file owned by the "editors" or +# "www-data" groups. +groups = editors, www-data + +# Trust entries in hgrc files owned by the following users. +users = apache, bobo + + + +
+ + diff -r 5276f40fca1c -r 896ab6eaf1c6 en/ch06-filenames.xml --- a/en/ch06-filenames.xml Thu Jul 09 13:32:44 2009 +0900 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,450 +0,0 @@ - - - - - File names and pattern matching - - Mercurial provides mechanisms that let you work with file - names in a consistent and expressive way. - - - Simple file naming - - Mercurial uses a unified piece of machinery under the - hood to handle file names. Every command behaves - uniformly with respect to file names. The way in which commands - work with file names is as follows. - - If you explicitly name real files on the command line, - Mercurial works with exactly those files, as you would expect. - &interaction.filenames.files; - - When you provide a directory name, Mercurial will interpret - this as operate on every file in this directory and its - subdirectories. Mercurial traverses the files and - subdirectories in a directory in alphabetical order. When it - encounters a subdirectory, it will traverse that subdirectory - before continuing with the current directory. - - &interaction.filenames.dirs; - - - - Running commands without any file names - - Mercurial's commands that work with file names have useful - default behaviors when you invoke them without providing any - file names or patterns. What kind of behavior you should - expect depends on what the command does. Here are a few rules - of thumb you can use to predict what a command is likely to do - if you don't give it any names to work with. - - Most commands will operate on the entire working - directory. This is what the hg - add command does, for example. - - If the command has effects that are difficult or - impossible to reverse, it will force you to explicitly - provide at least one name or pattern (see below). This - protects you from accidentally deleting files by running - hg remove with no - arguments, for example. - - - It's easy to work around these default behaviors if they - don't suit you. 
If a command normally operates on the whole - working directory, you can invoke it on just the current - directory and its subdirectories by giving it the name - .. - - &interaction.filenames.wdir-subdir; - - Along the same lines, some commands normally print file - names relative to the root of the repository, even if you're - invoking them from a subdirectory. Such a command will print - file names relative to your subdirectory if you give it explicit - names. Here, we're going to run hg - status from a subdirectory, and get it to operate on - the entire working directory while printing file names relative - to our subdirectory, by passing it the output of the hg root command. - - &interaction.filenames.wdir-relname; - - - - Telling you what's going on - - The hg add example in the - preceding section illustrates something else that's helpful - about Mercurial commands. If a command operates on a file that - you didn't name explicitly on the command line, it will usually - print the name of the file, so that you will not be surprised - by what's going on. - - The principle here is of least - surprise. If you've exactly named a file on the - command line, there's no point in repeating it back at you. If - Mercurial is acting on a file implicitly, e.g. - because you provided no names, or a directory, or a pattern (see - below), it is safest to tell you what files it's operating on. - - For commands that behave this way, you can silence them - using the option. You - can also get them to print the name of every file, even those - you've named explicitly, using the option. - - - - Using patterns to identify files - - In addition to working with file and directory names, - Mercurial lets you use patterns to identify - files. Mercurial's pattern handling is expressive. - - On Unix-like systems (Linux, MacOS, etc.), the job of - matching file names to patterns normally falls to the shell. On - these systems, you must explicitly tell Mercurial that a name is - a pattern.
On Windows, the shell does not expand patterns, so - Mercurial will automatically identify names that are patterns, - and expand them for you. - - To provide a pattern in place of a regular name on the - command line, the mechanism is simple: - syntax:patternbody - That is, a pattern is identified by a short text string that - says what kind of pattern this is, followed by a colon, followed - by the actual pattern. - - Mercurial supports two kinds of pattern syntax. The most - frequently used is called glob; this is the - same kind of pattern matching used by the Unix shell, and should - be familiar to Windows command prompt users, too. - - When Mercurial does automatic pattern matching on Windows, - it uses glob syntax. You can thus omit the - glob: prefix on Windows, but - it's safe to use it, too. - - The re syntax is more powerful; it lets - you specify patterns using regular expressions, also known as - regexps. - - By the way, in the examples that follow, notice that I'm - careful to wrap all of my patterns in quote characters, so that - they won't get expanded by the shell before Mercurial sees - them. - - - Shell-style <literal>glob</literal> patterns - - This is an overview of the kinds of patterns you can use - when you're matching on glob patterns. - - The * character matches - any string, within a single directory. - - &interaction.filenames.glob.star; - - The ** pattern matches - any string, and crosses directory boundaries. It's not a - standard Unix glob token, but it's accepted by several popular - Unix shells, and is very useful. - - &interaction.filenames.glob.starstar; - - The ? pattern matches - any single character. - - &interaction.filenames.glob.question; - - The [ character begins a - character class. This matches any single - character within the class. The class ends with a - ] character. A class may - contain multiple ranges of the form - a-f, which is shorthand for - abcdef. 
- - &interaction.filenames.glob.range; - - If the first character after the - [ in a character class is a - !, it - negates the class, making it match any - single character not in the class. - - A { begins a group of - subpatterns, where the whole group matches if any subpattern - in the group matches. The , - character separates subpatterns, and - } ends the group. - - &interaction.filenames.glob.group; - - - Watch out! - - Don't forget that if you want to match a pattern in any - directory, you should not be using the - * match-any token, as this - will only match within one directory. Instead, use the - ** token. This small - example illustrates the difference between the two. - - &interaction.filenames.glob.star-starstar; - - - - - Regular expression matching with <literal>re</literal> - patterns - - Mercurial accepts the same regular expression syntax as - the Python programming language (it uses Python's regexp - engine internally). This is based on the Perl language's - regexp syntax, which is the most popular dialect in use (it's - also used in Java, for example). - - I won't discuss Mercurial's regexp dialect in any detail - here, as regexps are not often used. Perl-style regexps are - in any case already exhaustively documented on a multitude of - web sites, and in many books. Instead, I will focus here on a - few things you should know if you find yourself needing to use - regexps with Mercurial. - - A regexp is matched against an entire file name, relative - to the root of the repository. In other words, even if you're - already in subdirectory foo, if you want to match files - under this directory, your pattern must start with - foo/. - - One thing to note, if you're familiar with Perl-style - regexps, is that Mercurial's are rooted. - That is, a regexp starts matching against the beginning of a - string; it doesn't look for a match anywhere within the - string. To match anywhere in a string, start your pattern - with .*.
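The rooted behavior described above can be tried out directly in Python, since Mercurial uses Python's regexp engine. This is only a sketch: it assumes that rooted matching corresponds to Python's re.match, which anchors at the start of the string, and matches_rooted is a hypothetical helper name, not a Mercurial function.

```python
import re


def matches_rooted(pattern, name):
    """Return True if `pattern` matches starting at the beginning of `name`.

    Hypothetical helper; re.match only anchors at the start of the string.
    """
    return re.match(pattern, name) is not None


name = "foo/bar.c"  # a file name relative to the repository root

print(matches_rooted(r"bar\.c", name))    # False: pattern is not at the start
print(matches_rooted(r"foo/", name))      # True: pattern starts with foo/
print(matches_rooted(r".*bar\.c", name))  # True: leading .* matches anywhere
```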
- - - - - Filtering files - - Not only does Mercurial give you a variety of ways to - specify files; it lets you further winnow those files using - filters. Commands that work with file - names accept two filtering options. - - , or - , lets you - specify a pattern that file names must match in order to be - processed. - - , or - , gives you a - way to avoid processing files, if they - match this pattern. - - You can provide multiple and options on the command line, - and intermix them as you please. Mercurial interprets the - patterns you provide using glob syntax by default (but you can - use regexps if you need to). - - You can read a - filter as process only the files that match this - filter. - - &interaction.filenames.filter.include; - - The filter is best - read as process only the files that don't match this - pattern. - - &interaction.filenames.filter.exclude; - - - - Permanently ignoring unwanted files and directories - - When you create a new repository, the chances are - that over time it will grow to contain files that ought to - not be managed by Mercurial, but which you - don't want to see listed every time you run hg - status. For instance, build products - are files that are created as part of a build but which should - not be managed by a revision control system. The most common - build products are output files produced by software tools such - as compilers. As another example, many text editors litter a - directory with lock files, temporary working files, and backup - files, which it also makes no sense to manage. - - To have Mercurial permanently ignore such files, create a - file named .hgignore in the root of your - repository. You should hg - add this file so that it gets tracked with the rest of - your repository contents, since your collaborators will probably - find it useful too. - - By default, the .hgignore file should - contain a list of regular expressions, one per line. Empty - lines are skipped. 
Most people prefer to describe the files they - want to ignore using the glob syntax that we - described above, so a typical .hgignore - file will start with this directive: - - syntax: glob - - This tells Mercurial to interpret the lines that follow as - glob patterns, not regular expressions. - - Here is a typical-looking .hgignore - file. - - syntax: glob -# This line is a comment, and will be skipped. -# Empty lines are skipped too. - -# Backup files left behind by the Emacs editor. -*~ - -# Lock files used by the Emacs editor. -# Notice that the "#" character is quoted with a backslash. -# This prevents it from being interpreted as starting a comment. -.\#* - -# Temporary files used by the vim editor. -.*.swp - -# A hidden file created by the Mac OS X Finder. -.DS_Store - - - - - Case sensitivity - - If you're working in a mixed development environment that - contains both Linux (or other Unix) systems and Macs or Windows - systems, you should keep in the back of your mind the knowledge - that they treat the case (N versus - n) of file names in incompatible ways. This is - not very likely to affect you, and it's easy to deal with if it - does, but it could surprise you if you don't know about - it. - - Operating systems and filesystems differ in the way they - handle the case of characters in file and - directory names. There are three common ways to handle case in - names. - - Completely case insensitive. Uppercase and - lowercase versions of a letter are treated as identical, - both when creating a file and during subsequent accesses. - This is common on older DOS-based systems. - - Case preserving, but insensitive. When a file - or directory is created, the case of its name is stored, and - can be retrieved and displayed by the operating system. - When an existing file is being looked up, its case is - ignored. This is the standard arrangement on Windows and - MacOS. The names foo and - FoO identify the same file. 
This - treatment of uppercase and lowercase letters as - interchangeable is also referred to as case - folding. - - Case sensitive. The case of a name is - significant at all times. The names foo - and FoO identify different files. This is the way Linux - and Unix systems normally work. - - - On Unix-like systems, it is possible to have any or all of - the above ways of handling case in action at once. For example, - if you use a USB thumb drive formatted with a FAT32 filesystem - on a Linux system, Linux will handle names on that filesystem in - a case preserving, but insensitive, way. - - - Safe, portable repository storage - - Mercurial's repository storage mechanism is case - safe. It translates file names so that they can - be safely stored on both case sensitive and case insensitive - filesystems. This means that you can use normal file copying - tools to transfer a Mercurial repository onto, for example, a - USB thumb drive, and safely move that drive and repository - back and forth between a Mac, a PC running Windows, and a - Linux box. - - - - Detecting case conflicts - - When operating in the working directory, Mercurial honours - the naming policy of the filesystem where the working - directory is located. If the filesystem is case preserving, - but insensitive, Mercurial will treat names that differ only - in case as the same. - - An important aspect of this approach is that it is - possible to commit a changeset on a case sensitive (typically - Linux or Unix) filesystem that will cause trouble for users on - case insensitive (usually Windows and MacOS) filesystems. If a - Linux user commits changes to two files, one named - myfile.c and the other named - MyFile.C, they will be stored correctly - in the repository. And in the working directories of other - Linux users, they will be correctly represented as separate - files.
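If you want to check a list of file names for this kind of trouble before committing, a rough Python sketch might look like the following. It assumes that lowercasing a name is a fair approximation of how a case insensitive filesystem folds case, which is not exact for every filesystem, and case_collisions is a hypothetical helper, not the check that Mercurial itself performs.

```python
from collections import defaultdict


def case_collisions(names):
    """Return groups of names that differ only in case.

    Hypothetical helper; lowercasing stands in for real case folding.
    """
    groups = defaultdict(list)
    for name in names:
        groups[name.lower()].append(name)
    # Keep only the groups where more than one spelling was seen.
    return [sorted(group) for group in groups.values() if len(group) > 1]


print(case_collisions(["myfile.c", "MyFile.C", "README"]))
# [['MyFile.C', 'myfile.c']]
```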
- - If a Windows or Mac user pulls this change, they will not - initially have a problem, because Mercurial's repository - storage mechanism is case safe. However, once they try to - hg update the working - directory to that changeset, or hg - merge with that changeset, Mercurial will spot the - conflict between the two file names that the filesystem would - treat as the same, and forbid the update or merge from - occurring. - - - - Fixing a case conflict - - If you are using Windows or a Mac in a mixed environment - where some of your collaborators are using Linux or Unix, and - Mercurial reports a case folding conflict when you try to - hg update or hg merge, the procedure to fix the - problem is simple. - - Just find a nearby Linux or Unix box, clone the problem - repository onto it, and use Mercurial's hg rename command to change the - names of any offending files or directories so that they will - no longer cause case folding conflicts. Commit this change, - hg pull or hg push it across to your Windows or - MacOS system, and hg update - to the revision with the non-conflicting names. - - The changeset with case-conflicting names will remain in - your project's history, and you still won't be able to - hg update your working - directory to that changeset on a Windows or MacOS system, but - you can continue development unimpeded. - - - - - diff -r 5276f40fca1c -r 896ab6eaf1c6 en/ch07-branch.xml --- a/en/ch07-branch.xml Thu Jul 09 13:32:44 2009 +0900 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,532 +0,0 @@ - - - - - Managing releases and branchy development - - Mercurial provides several mechanisms for you to manage a - project that is making progress on multiple fronts at once. To - understand these mechanisms, let's first take a brief look at a - fairly normal software project structure. - - Many software projects issue periodic major - releases that contain substantial new features. In parallel, they - may issue minor releases. 
These are usually - identical to the major releases off which they're based, but with - a few bugs fixed. - - In this chapter, we'll start by talking about how to keep - records of project milestones such as releases. We'll then - continue on to talk about the flow of work between different - phases of a project, and how Mercurial can help you to isolate and - manage this work. - - - Giving a persistent name to a revision - - Once you decide that you'd like to call a particular - revision a release, it's a good idea to record - the identity of that revision. This will let you reproduce that - release at a later date, for whatever purpose you might need at - the time (reproducing a bug, porting to a new platform, etc). - &interaction.tag.init; - - Mercurial lets you give a permanent name to any revision - using the hg tag command. Not - surprisingly, these names are called tags. - - &interaction.tag.tag; - - A tag is nothing more than a symbolic name - for a revision. Tags exist purely for your convenience, so that - you have a handy permanent way to refer to a revision; Mercurial - doesn't interpret the tag names you use in any way. Neither - does Mercurial place any restrictions on the name of a tag, - beyond a few that are necessary to ensure that a tag can be - parsed unambiguously. A tag name cannot contain any of the - following characters: - - Colon (ASCII 58, - :) - - Carriage return (ASCII 13, - \r) - - Newline (ASCII 10, - \n) - - - You can use the hg tags - command to display the tags present in your repository. In the - output, each tagged revision is identified first by its name, - then by revision number, and finally by the unique hash of the - revision. - - &interaction.tag.tags; - - Notice that tip is listed in the output - of hg tags. The - tip tag is a special floating - tag, which always identifies the newest revision in the - repository. - - In the output of the hg - tags command, tags are listed in reverse order, by - revision number. 
This usually means that recent tags are listed - before older tags. It also means that tip is - always going to be the first tag listed in the output of - hg tags. - - When you run hg log, if it - displays a revision that has tags associated with it, it will - print those tags. - - &interaction.tag.log; - - Any time you need to provide a revision ID to a Mercurial - command, the command will accept a tag name in its place. - Internally, Mercurial will translate your tag name into the - corresponding revision ID, then use that. - - &interaction.tag.log.v1.0; - - There's no limit on the number of tags you can have in a - repository, or on the number of tags that a single revision can - have. As a practical matter, it's not a great idea to have - too many (a number which will vary from project - to project), simply because tags are supposed to help you to - find revisions. If you have lots of tags, the ease of using - them to identify revisions diminishes rapidly. - - For example, if your project has milestones as frequent as - every few days, it's perfectly reasonable to tag each one of - those. But if you have a continuous build system that makes - sure every revision can be built cleanly, you'd be introducing a - lot of noise if you were to tag every clean build. Instead, you - could tag failed builds (on the assumption that they're rare!), - or simply not use tags to track buildability. - - If you want to remove a tag that you no longer want, use - hg tag --remove. - - &interaction.tag.remove; - - You can also modify a tag at any time, so that it identifies - a different revision, by simply issuing a new hg tag command. You'll have to use the - option to tell Mercurial - that you really want to update the - tag. - - &interaction.tag.replace; - - There will still be a permanent record of the previous - identity of the tag, but Mercurial will no longer use it. 
- There's thus no penalty to tagging the wrong revision; all you - have to do is turn around and tag the correct revision once you - discover your error. - - Mercurial stores tags in a normal revision-controlled file - in your repository. If you've created any tags, you'll find - them in a file in the root of your repository named .hgtags. When you run the hg tag command, Mercurial modifies - this file, then automatically commits the change to it. This - means that every time you run hg - tag, you'll see a corresponding changeset in the - output of hg log. - - &interaction.tag.tip; - - - Handling tag conflicts during a merge - - You won't often need to care about the .hgtags file, but it sometimes - makes its presence known during a merge. The format of the - file is simple: it consists of a series of lines. Each line - starts with a changeset hash, followed by a space, followed by - the name of a tag. - - If you're resolving a conflict in the .hgtags file during a merge, - there's one twist to modifying the .hgtags file: when Mercurial is - parsing the tags in a repository, it - never reads the working copy of the - .hgtags file. Instead, it - reads the most recently committed - revision of the file. - - An unfortunate consequence of this design is that you - can't actually verify that your merged .hgtags file is correct until - after you've committed a change. So if - you find yourself resolving a conflict on .hgtags during a merge, be sure to - run hg tags after you commit. - If it finds an error in the .hgtags file, it will report the - location of the error, which you can then fix and commit. You - should then run hg tags - again, just to be sure that your fix is correct. - - - - Tags and cloning - - You may have noticed that the hg - clone command has a option that lets you clone - an exact copy of the repository as of a particular changeset. - The new clone will not contain any project history that comes - after the revision you specified. 
This has an interaction - with tags that can surprise the unwary. - - Recall that a tag is stored as a revision to the .hgtags file, so that when you - create a tag, the changeset in which it's recorded necessarily - refers to an older changeset. When you run hg clone -r foo to clone a - repository as of tag foo, the new clone - will not contain the history that created the - tag that you used to clone the repository. The - result is that you'll get exactly the right subset of the - project's history in the new repository, but - not the tag you might have - expected. - - - - When permanent tags are too much - - Since Mercurial's tags are revision controlled and carried - around with a project's history, everyone you work with will - see the tags you create. But giving names to revisions has - uses beyond simply noting that revision - 4237e45506ee is really - v2.0.2. If you're trying to track down a - subtle bug, you might want a tag to remind you of something - like Anne saw the symptoms with this - revision. - - For cases like this, what you might want to use are - local tags. You can create a local tag - with the option to the - hg tag command. This will - store the tag in a file called .hg/localtags. Unlike .hgtags, .hg/localtags is not revision - controlled. Any tags you create using remain strictly local to the - repository you're currently working in. - - - - - The flow of changes&emdash;big picture vs. little - - To return to the outline I sketched at the beginning of a - chapter, let's think about a project that has multiple - concurrent pieces of work under development at once. - - There might be a push for a new main release; - a new minor bugfix release to the last main release; and an - unexpected hot fix to an old release that is now - in maintenance mode. - - The usual way people refer to these different concurrent - directions of development is as branches. 
- However, we've already seen numerous times that Mercurial treats - all of history as a series of branches and - merges. Really, what we have here is two ideas that are - peripherally related, but which happen to share a name. - - Big picture branches represent - the sweep of a project's evolution; people give them names, - and talk about them in conversation. - - Little picture branches are - artefacts of the day-to-day activity of developing and - merging changes. They expose the narrative of how the code - was developed. - - - - - Managing big-picture branches in repositories - - The easiest way to isolate a big picture - branch in Mercurial is in a dedicated repository. If you have - an existing shared repository&emdash;let's call it - myproject&emdash;that reaches a - 1.0 milestone, you can start to prepare for - future maintenance releases on top of version 1.0 by tagging the - revision from which you prepared the 1.0 release. - - &interaction.branch-repo.tag; - - You can then clone a new shared - myproject-1.0.1 repository as of that - tag. - - &interaction.branch-repo.clone; - - Afterwards, if someone needs to work on a bug fix that ought - to go into an upcoming 1.0.1 minor release, they clone the - myproject-1.0.1 repository, make their - changes, and push them back. - - &interaction.branch-repo.bugfix; - - Meanwhile, development for - the next major release can continue, isolated and unabated, in - the myproject repository. - - &interaction.branch-repo.new; - - - - Don't repeat yourself: merging across branches - - In many cases, if you have a bug to fix on a maintenance - branch, the chances are good that the bug exists on your - project's main branch (and possibly other maintenance branches, - too). It's a rare developer who wants to fix the same bug - multiple times, so let's look at a few ways that Mercurial can - help you to manage these bugfixes without duplicating your - work. 
- - In the simplest instance, all you need to do is pull changes - from your maintenance branch into your local clone of the target - branch. - - &interaction.branch-repo.pull; - - You'll then need to merge the heads of the two branches, and - push back to the main branch. - - &interaction.branch-repo.merge; - - - - Naming branches within one repository - - In most instances, isolating branches in repositories is the - right approach. Its simplicity makes it easy to understand; and - so it's hard to make mistakes. There's a one-to-one - relationship between branches you're working in and directories - on your system. This lets you use normal (non-Mercurial-aware) - tools to work on files within a branch/repository. - - If you're more in the power user category - (and your collaborators are too), there is - an alternative way of handling branches that you can consider. - I've already mentioned the human-level distinction between - small picture and big picture - branches. While Mercurial works with multiple small - picture branches in a repository all the time (for - example after you pull changes in, but before you merge them), - it can also work with multiple big - picture branches. - - The key to working this way is that Mercurial lets you - assign a persistent name to a branch. - There always exists a branch named default. - Even before you start naming branches yourself, you can find - traces of the default branch if you look for - them. - - As an example, when you run the hg - commit command, and it pops up your editor so that - you can enter a commit message, look for a line that contains - the text HG: branch default at - the bottom. This is telling you that your commit will occur on - the branch named default. - - To start working with named branches, use the hg branches command. This command - lists the named branches already present in your repository, - telling you which changeset is the tip of each. 
- - &interaction.branch-named.branches; - - Since you haven't created any named branches yet, the only - one that exists is default. - - To find out what the current branch is, run - the hg branch command, giving - it no arguments. This tells you what branch the parent of the - current changeset is on. - - &interaction.branch-named.branch; - - To create a new branch, run the hg - branch command again. This time, give it one - argument: the name of the branch you want to create. - - &interaction.branch-named.create; - - After you've created a branch, you might wonder what effect - the hg branch command has had. - What do the hg status and - hg tip commands report? - - &interaction.branch-named.status; - - Nothing has changed in the - working directory, and there's been no new history created. As - this suggests, running the hg - branch command has no permanent effect; it only - tells Mercurial what branch name to use the - next time you commit a changeset. - - When you commit a change, Mercurial records the name of the - branch on which you committed. Once you've switched from the - default branch to another and committed, - you'll see the name of the new branch show up in the output of - hg log, hg tip, and other commands that - display the same kind of output. - - &interaction.branch-named.commit; - - The hg log-like commands - will print the branch name of every changeset that's not on the - default branch. As a result, if you never - use named branches, you'll never see this information. - - Once you've named a branch and committed a change with that - name, every subsequent commit that descends from that change - will inherit the same branch name. You can change the name of a - branch at any time, using the hg - branch command. - - &interaction.branch-named.rebranch; - - In practice, this is something you won't do very often, as - branch names tend to have fairly long lifetimes. (This isn't a - rule, just an observation.) 
- - - - Dealing with multiple named branches in a - repository - - If you have more than one named branch in a repository, - Mercurial will remember the branch that your working directory - on when you start a command like hg - update or hg pull - -u. It will update the working directory to the tip - of this branch, no matter what the repo-wide tip - is. To update to a revision that's on a different named branch, - you may need to use the - option to hg update. - - This behavior is a little subtle, so let's see it in - action. First, let's remind ourselves what branch we're - currently on, and what branches are in our repository. - - &interaction.branch-named.parents; - - We're on the bar branch, but there also - exists an older hg foo - branch. - - We can hg update back and - forth between the tips of the foo and - bar branches without needing to use the - option, because this - only involves going backwards and forwards linearly through our - change history. - - &interaction.branch-named.update-switchy; - - If we go back to the foo branch and then - run hg update, it will keep us - on foo, not move us to the tip of - bar. - - &interaction.branch-named.update-nothing; - - Committing a new change on the foo branch - introduces a new head. - - &interaction.branch-named.foo-commit; - - - - Branch names and merging - - As you've probably noticed, merges in Mercurial are not - symmetrical. Let's say our repository has two heads, 17 and 23. - If I hg update to 17 and then - hg merge with 23, Mercurial - records 17 as the first parent of the merge, and 23 as the - second. Whereas if I hg update - to 23 and then hg merge with - 17, it records 23 as the first parent, and 17 as the - second. - - This affects Mercurial's choice of branch name when you - merge. After a merge, Mercurial will retain the branch name of - the first parent when you commit the result of the merge. 
If - your first parent's branch name is foo, and - you merge with bar, the branch name will - still be foo after you merge. - - It's not unusual for a repository to contain multiple heads, - each with the same branch name. Let's say I'm working on the - foo branch, and so are you. We commit - different changes; I pull your changes; I now have two heads, - each claiming to be on the foo branch. The - result of a merge will be a single head on the - foo branch, as you might hope. - - But if I'm working on the bar branch, and - I merge work from the foo branch, the result - will remain on the bar branch. - - &interaction.branch-named.merge; - - To give a more concrete example, if I'm working on the - bleeding-edge branch, and I want to bring in - the latest fixes from the stable branch, - Mercurial will choose the right - (bleeding-edge) branch name when I pull and - merge from stable. - - - - Branch naming is generally useful - - You shouldn't think of named branches as applicable only to - situations where you have multiple long-lived branches - cohabiting in a single repository. They're very useful even in - the one-branch-per-repository case. - - In the simplest case, giving a name to each branch gives you - a permanent record of which branch a changeset originated on. - This gives you more context when you're trying to follow the - history of a long-lived branchy project. - - If you're working with shared repositories, you can set up a - pretxnchangegroup hook on each - that will block incoming changes that have the - wrong branch name. This provides a simple, but - effective, defence against people accidentally pushing changes - from a bleeding edge branch to a - stable branch. Such a hook might look like this - inside the shared repo's - /.hgrc. 
- [hooks] -pretxnchangegroup.branch = hg heads --template '{branches} ' | grep mybranch - - - - diff -r 5276f40fca1c -r 896ab6eaf1c6 en/ch07-filenames.xml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/en/ch07-filenames.xml Fri Jul 10 02:32:17 2009 +0900 @@ -0,0 +1,451 @@ + + + + + File names and pattern matching + + Mercurial provides mechanisms that let you work with file + names in a consistent and expressive way. + + + Simple file naming + + Mercurial uses a unified piece of machinery under the + hood to handle file names. Every command behaves + uniformly with respect to file names. The way in which commands + work with file names is as follows. + + If you explicitly name real files on the command line, + Mercurial works with exactly those files, as you would expect. + &interaction.filenames.files; + + When you provide a directory name, Mercurial will interpret + this as operate on every file in this directory and its + subdirectories. Mercurial traverses the files and + subdirectories in a directory in alphabetical order. When it + encounters a subdirectory, it will traverse that subdirectory + before continuing with the current directory. + + &interaction.filenames.dirs; + + + + Running commands without any file names + + Mercurial's commands that work with file names have useful + default behaviors when you invoke them without providing any + file names or patterns. What kind of behavior you should + expect depends on what the command does. Here are a few rules + of thumb you can use to predict what a command is likely to do + if you don't give it any names to work with. + + Most commands will operate on the entire working + directory. This is what the hg + add command does, for example. + + If the command has effects that are difficult or + impossible to reverse, it will force you to explicitly + provide at least one name or pattern (see below). 
This + protects you from accidentally deleting files by running + hg remove with no + arguments, for example. + + + It's easy to work around these default behaviors if they + don't suit you. If a command normally operates on the whole + working directory, you can invoke it on just the current + directory and its subdirectories by giving it the name + .. + + &interaction.filenames.wdir-subdir; + + Along the same lines, some commands normally print file + names relative to the root of the repository, even if you're + invoking them from a subdirectory. Such a command will print + file names relative to your subdirectory if you give it explicit + names. Here, we're going to run hg + status from a subdirectory, and get it to operate on + the entire working directory while printing file names relative + to our subdirectory, by passing it the output of the hg root command. + + &interaction.filenames.wdir-relname; + + + + Telling you what's going on + + The hg add example in the + preceding section illustrates something else that's helpful + about Mercurial commands. If a command operates on a file that + you didn't name explicitly on the command line, it will usually + print the name of the file, so that you will not be surprised + by what's going on. + + The principle here is of least + surprise. If you've exactly named a file on the + command line, there's no point in repeating it back at you. If + Mercurial is acting on a file implicitly, e.g. + because you provided no names, or a directory, or a pattern (see + below), it is safest to tell you what files it's operating on. + + For commands that behave this way, you can silence them + using the option. You + can also get them to print the name of every file, even those + you've named explicitly, using the option. + + + + Using patterns to identify files + + In addition to working with file and directory names, + Mercurial lets you use patterns to identify + files. Mercurial's pattern handling is expressive.
+ + On Unix-like systems (Linux, MacOS, etc.), the job of + matching file names to patterns normally falls to the shell. On + these systems, you must explicitly tell Mercurial that a name is + a pattern. On Windows, the shell does not expand patterns, so + Mercurial will automatically identify names that are patterns, + and expand them for you. + + To provide a pattern in place of a regular name on the + command line, the mechanism is simple: + syntax:patternbody + That is, a pattern is identified by a short text string that + says what kind of pattern this is, followed by a colon, followed + by the actual pattern. + + Mercurial supports two kinds of pattern syntax. The most + frequently used is called glob; this is the + same kind of pattern matching used by the Unix shell, and should + be familiar to Windows command prompt users, too. + + When Mercurial does automatic pattern matching on Windows, + it uses glob syntax. You can thus omit the + glob: prefix on Windows, but + it's safe to use it, too. + + The re syntax is more powerful; it lets + you specify patterns using regular expressions, also known as + regexps. + + By the way, in the examples that follow, notice that I'm + careful to wrap all of my patterns in quote characters, so that + they won't get expanded by the shell before Mercurial sees + them. + + + Shell-style <literal>glob</literal> patterns + + This is an overview of the kinds of patterns you can use + when you're matching on glob patterns. + + The * character matches + any string, within a single directory. + + &interaction.filenames.glob.star; + + The ** pattern matches + any string, and crosses directory boundaries. It's not a + standard Unix glob token, but it's accepted by several popular + Unix shells, and is very useful. + + &interaction.filenames.glob.starstar; + + The ? pattern matches + any single character. + + &interaction.filenames.glob.question; + + The [ character begins a + character class. 
This matches any single + character within the class. The class ends with a + ] character. A class may + contain multiple ranges of the form + a-f, which is shorthand for + abcdef. + + &interaction.filenames.glob.range; + + If the first character after the + [ in a character class is a + !, it + negates the class, making it match any + single character not in the class. + + A { begins a group of + subpatterns, where the whole group matches if any subpattern + in the group matches. The , + character separates subpatterns, and + } ends the group. + + &interaction.filenames.glob.group; + + + Watch out! + + Don't forget that if you want to match a pattern in any + directory, you should not be using the + * match-any token, as this + will only match within one directory. Instead, use the + ** token. This small + example illustrates the difference between the two. + + &interaction.filenames.glob.star-starstar; + + + + + Regular expression matching with <literal>re</literal> + patterns + + Mercurial accepts the same regular expression syntax as + the Python programming language (it uses Python's regexp + engine internally). This is based on the Perl language's + regexp syntax, which is the most popular dialect in use (it's + also used in Java, for example). + + I won't discuss Mercurial's regexp dialect in any detail + here, as regexps are not often used. Perl-style regexps are + in any case already exhaustively documented on a multitude of + web sites, and in many books. Instead, I will focus here on a + few things you should know if you find yourself needing to use + regexps with Mercurial. + + A regexp is matched against an entire file name, relative + to the root of the repository. In other words, even if you're + already in subdirectory foo, if you want to match files + under this directory, your pattern must start with + foo/. + + One thing to note, if you're familiar with Perl-style + regexps, is that Mercurial's are rooted.
+ That is, a regexp starts matching against the beginning of a + string; it doesn't look for a match anywhere within the + string. To match anywhere in a string, start your pattern + with .*. + + + + + Filtering files + + Not only does Mercurial give you a variety of ways to + specify files; it lets you further winnow those files using + filters. Commands that work with file + names accept two filtering options. + + , or + , lets you + specify a pattern that file names must match in order to be + processed. + + , or + , gives you a + way to avoid processing files, if they + match this pattern. + + You can provide multiple and options on the command line, + and intermix them as you please. Mercurial interprets the + patterns you provide using glob syntax by default (but you can + use regexps if you need to). + + You can read a + filter as process only the files that match this + filter. + + &interaction.filenames.filter.include; + + The filter is best + read as process only the files that don't match this + pattern. + + &interaction.filenames.filter.exclude; + + + + Permanently ignoring unwanted files and directories + + When you create a new repository, the chances are + that over time it will grow to contain files that ought to + not be managed by Mercurial, but which you + don't want to see listed every time you run hg + status. For instance, build products + are files that are created as part of a build but which should + not be managed by a revision control system. The most common + build products are output files produced by software tools such + as compilers. As another example, many text editors litter a + directory with lock files, temporary working files, and backup + files, which it also makes no sense to manage. + + To have Mercurial permanently ignore such files, create a + file named .hgignore in the root of your + repository. 
You should hg + add this file so that it gets tracked with the rest of + your repository contents, since your collaborators will probably + find it useful too. + + By default, the .hgignore file should + contain a list of regular expressions, one per line. Empty + lines are skipped. Most people prefer to describe the files they + want to ignore using the glob syntax that we + described above, so a typical .hgignore + file will start with this directive: + + syntax: glob + + This tells Mercurial to interpret the lines that follow as + glob patterns, not regular expressions. + + Here is a typical-looking .hgignore + file. + + syntax: glob +# This line is a comment, and will be skipped. +# Empty lines are skipped too. + +# Backup files left behind by the Emacs editor. +*~ + +# Lock files used by the Emacs editor. +# Notice that the "#" character is quoted with a backslash. +# This prevents it from being interpreted as starting a comment. +.\#* + +# Temporary files used by the vim editor. +.*.swp + +# A hidden file created by the Mac OS X Finder. +.DS_Store + + + + + Case sensitivity + + If you're working in a mixed development environment that + contains both Linux (or other Unix) systems and Macs or Windows + systems, you should keep in the back of your mind the knowledge + that they treat the case (N versus + n) of file names in incompatible ways. This is + not very likely to affect you, and it's easy to deal with if it + does, but it could surprise you if you don't know about + it. + + Operating systems and filesystems differ in the way they + handle the case of characters in file and + directory names. There are three common ways to handle case in + names. + + Completely case insensitive. Uppercase and + lowercase versions of a letter are treated as identical, + both when creating a file and during subsequent accesses. + This is common on older DOS-based systems. + + Case preserving, but insensitive. 
When a file + or directory is created, the case of its name is stored, and + can be retrieved and displayed by the operating system. + When an existing file is being looked up, its case is + ignored. This is the standard arrangement on Windows and + MacOS. The names foo and + FoO identify the same file. This + treatment of uppercase and lowercase letters as + interchangeable is also referred to as case + folding. + + Case sensitive. The case of a name + is significant at all times. The names + foo and FoO + identify different files. This is the way Linux and Unix + systems normally work. + + + On Unix-like systems, it is possible to have any or all of + the above ways of handling case in action at once. For example, + if you use a USB thumb drive formatted with a FAT32 filesystem + on a Linux system, Linux will handle names on that filesystem in + a case preserving, but insensitive, way. + + + Safe, portable repository storage + + Mercurial's repository storage mechanism is case + safe. It translates file names so that they can + be safely stored on both case sensitive and case insensitive + filesystems. This means that you can use normal file copying + tools to transfer a Mercurial repository onto, for example, a + USB thumb drive, and safely move that drive and repository + back and forth between a Mac, a PC running Windows, and a + Linux box. + + + + Detecting case conflicts + + When operating in the working directory, Mercurial honours + the naming policy of the filesystem where the working + directory is located. If the filesystem is case preserving, + but insensitive, Mercurial will treat names that differ only + in case as the same. + + An important aspect of this approach is that it is + possible to commit a changeset on a case sensitive (typically + Linux or Unix) filesystem that will cause trouble for users on + case insensitive (usually Windows and MacOS) filesystems.
If a + Linux user commits changes to two files, one named + myfile.c and the other named + MyFile.C, they will be stored correctly + in the repository. And in the working directories of other + Linux users, they will be correctly represented as separate + files. + + If a Windows or Mac user pulls this change, they will not + initially have a problem, because Mercurial's repository + storage mechanism is case safe. However, once they try to + hg update the working + directory to that changeset, or hg + merge with that changeset, Mercurial will spot the + conflict between the two file names that the filesystem would + treat as the same, and forbid the update or merge from + occurring. + + + + Fixing a case conflict + + If you are using Windows or a Mac in a mixed environment + where some of your collaborators are using Linux or Unix, and + Mercurial reports a case folding conflict when you try to + hg update or hg merge, the procedure to fix the + problem is simple. + + Just find a nearby Linux or Unix box, clone the problem + repository onto it, and use Mercurial's hg rename command to change the + names of any offending files or directories so that they will + no longer cause case folding conflicts. Commit this change, + hg pull or hg push it across to your Windows or + MacOS system, and hg update + to the revision with the non-conflicting names. + + The changeset with case-conflicting names will remain in + your project's history, and you still won't be able to + hg update your working + directory to that changeset on a Windows or MacOS system, but + you can continue development unimpeded. + + + + + diff -r 5276f40fca1c -r 896ab6eaf1c6 en/ch08-branch.xml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/en/ch08-branch.xml Fri Jul 10 02:32:17 2009 +0900 @@ -0,0 +1,533 @@ + + + + + Managing releases and branchy development + + Mercurial provides several mechanisms for you to manage a + project that is making progress on multiple fronts at once. 
To + understand these mechanisms, let's first take a brief look at a + fairly normal software project structure. + + Many software projects issue periodic major + releases that contain substantial new features. In parallel, they + may issue minor releases. These are usually + identical to the major releases off which they're based, but with + a few bugs fixed. + + In this chapter, we'll start by talking about how to keep + records of project milestones such as releases. We'll then + continue on to talk about the flow of work between different + phases of a project, and how Mercurial can help you to isolate and + manage this work. + + + Giving a persistent name to a revision + + Once you decide that you'd like to call a particular + revision a release, it's a good idea to record + the identity of that revision. This will let you reproduce that + release at a later date, for whatever purpose you might need at + the time (reproducing a bug, porting to a new platform, etc). + &interaction.tag.init; + + Mercurial lets you give a permanent name to any revision + using the hg tag command. Not + surprisingly, these names are called tags. + + &interaction.tag.tag; + + A tag is nothing more than a symbolic name + for a revision. Tags exist purely for your convenience, so that + you have a handy permanent way to refer to a revision; Mercurial + doesn't interpret the tag names you use in any way. Neither + does Mercurial place any restrictions on the name of a tag, + beyond a few that are necessary to ensure that a tag can be + parsed unambiguously. A tag name cannot contain any of the + following characters: + + Colon (ASCII 58, + :) + + Carriage return (ASCII 13, + \r) + + Newline (ASCII 10, + \n) + + + You can use the hg tags + command to display the tags present in your repository. In the + output, each tagged revision is identified first by its name, + then by revision number, and finally by the unique hash of the + revision. 
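The tag-name restrictions listed earlier in this section can be illustrated with a small check. This is only a sketch of the rule as the text states it, not Mercurial's own validation code; the names `FORBIDDEN` and `invalid_tag_chars` are hypothetical, and Mercurial may apply further checks that are not described here.

```python
# Characters that, according to the text above, may not appear in a tag name.
# The dictionary and function names are illustrative, not part of Mercurial.
FORBIDDEN = {":": "colon", "\r": "carriage return", "\n": "newline"}


def invalid_tag_chars(name):
    """Return descriptions of the forbidden characters found in `name`."""
    return [desc for ch, desc in FORBIDDEN.items() if ch in name]


if __name__ == "__main__":
    for candidate in ("v1.0", "release:1.0"):
        problems = invalid_tag_chars(candidate)
        status = "ok" if not problems else "contains " + ", ".join(problems)
        print(f"{candidate!r}: {status}")
```

An empty result means the name passes the three restrictions named in the text; anything else lists which restriction it violates.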
+ + &interaction.tag.tags; + + Notice that tip is listed in the output + of hg tags. The + tip tag is a special floating + tag, which always identifies the newest revision in the + repository. + + In the output of the hg + tags command, tags are listed in reverse order, by + revision number. This usually means that recent tags are listed + before older tags. It also means that tip is + always going to be the first tag listed in the output of + hg tags. + + When you run hg log, if it + displays a revision that has tags associated with it, it will + print those tags. + + &interaction.tag.log; + + Any time you need to provide a revision ID to a Mercurial + command, the command will accept a tag name in its place. + Internally, Mercurial will translate your tag name into the + corresponding revision ID, then use that. + + &interaction.tag.log.v1.0; + + There's no limit on the number of tags you can have in a + repository, or on the number of tags that a single revision can + have. As a practical matter, it's not a great idea to have + too many (a number which will vary from project + to project), simply because tags are supposed to help you to + find revisions. If you have lots of tags, the ease of using + them to identify revisions diminishes rapidly. + + For example, if your project has milestones as frequent as + every few days, it's perfectly reasonable to tag each one of + those. But if you have a continuous build system that makes + sure every revision can be built cleanly, you'd be introducing a + lot of noise if you were to tag every clean build. Instead, you + could tag failed builds (on the assumption that they're rare!), + or simply not use tags to track buildability. + + If you want to remove a tag that you no longer want, use + hg tag --remove. + + &interaction.tag.remove; + + You can also modify a tag at any time, so that it identifies + a different revision, by simply issuing a new hg tag command. 
You'll have to use the + option to tell Mercurial + that you really want to update the + tag. + + &interaction.tag.replace; + + There will still be a permanent record of the previous + identity of the tag, but Mercurial will no longer use it. + There's thus no penalty to tagging the wrong revision; all you + have to do is turn around and tag the correct revision once you + discover your error. + + Mercurial stores tags in a normal revision-controlled file + in your repository. If you've created any tags, you'll find + them in a file in the root of your repository named .hgtags. When you run the hg tag command, Mercurial modifies + this file, then automatically commits the change to it. This + means that every time you run hg + tag, you'll see a corresponding changeset in the + output of hg log. + + &interaction.tag.tip; + + + Handling tag conflicts during a merge + + You won't often need to care about the .hgtags file, but it sometimes + makes its presence known during a merge. The format of the + file is simple: it consists of a series of lines. Each line + starts with a changeset hash, followed by a space, followed by + the name of a tag. + + If you're resolving a conflict in the .hgtags file during a merge, + there's one twist to modifying the .hgtags file: when Mercurial is + parsing the tags in a repository, it + never reads the working copy of the + .hgtags file. Instead, it + reads the most recently committed + revision of the file. + + An unfortunate consequence of this design is that you + can't actually verify that your merged .hgtags file is correct until + after you've committed a change. So if + you find yourself resolving a conflict on .hgtags during a merge, be sure to + run hg tags after you commit. + If it finds an error in the .hgtags file, it will report the + location of the error, which you can then fix and commit. You + should then run hg tags + again, just to be sure that your fix is correct. 
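Because the format of .hgtags is so simple, a merged copy can be sanity-checked by hand before committing it. The following is a minimal sketch, assuming only the layout described above (a changeset hash, a space, then the tag name) and assuming the hash is written as 40 hexadecimal digits. The function name `parse_hgtags` is hypothetical, and this is not how Mercurial itself reads the file; running hg tags after the commit remains the authoritative check.

```python
import re

# Assumption: a changeset hash in .hgtags is 40 lowercase hexadecimal digits.
HASH_RE = re.compile(r"^[0-9a-f]{40}$")


def parse_hgtags(text):
    """Split .hgtags content into (entries, errors).

    entries: list of (changeset_hash, tag_name) pairs, in file order.
    errors:  list of (line_number, line_text) for lines that do not fit
             the "hash, space, tag name" layout, such as leftover
             merge conflict markers.
    """
    entries, errors = [], []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # assumption: blank lines carry no information
        # Split on the first space only, so tag names may contain spaces.
        node, sep, name = line.partition(" ")
        if not sep or not name.strip() or not HASH_RE.match(node):
            errors.append((lineno, line))
        else:
            entries.append((node, name.strip()))
    return entries, errors


if __name__ == "__main__":
    sample = (
        "<<<<<<< local\n"
        + "a" * 40 + " v1.0\n"
        + "=======\n"
        + "b" * 40 + " v1.0.1\n"
    )
    entries, errors = parse_hgtags(sample)
    for lineno, line in errors:
        print(f"line {lineno}: unexpected content: {line!r}")
```

A check like this only catches lines that are malformed; it cannot tell whether a tag points at the revision you intended.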
+ + + + Tags and cloning + + You may have noticed that the hg + clone command has a -r option that lets you clone + an exact copy of the repository as of a particular changeset. + The new clone will not contain any project history that comes + after the revision you specified. This has an interaction + with tags that can surprise the unwary. + + Recall that a tag is stored as a revision to + the .hgtags file. When you + create a tag, the changeset in which it's recorded refers to an + older changeset. When you run hg clone + -r foo to clone a repository as of tag + foo, the new clone will not + contain any revision newer than the one the tag refers to, + including the revision where the tag was created. + The result is that you'll get exactly the right subset of the + project's history in the new repository, but + not the tag you might have + expected. + + + + When permanent tags are too much + + Since Mercurial's tags are revision controlled and carried + around with a project's history, everyone you work with will + see the tags you create. But giving names to revisions has + uses beyond simply noting that revision + 4237e45506ee is really + v2.0.2. If you're trying to track down a + subtle bug, you might want a tag to remind you of something + like Anne saw the symptoms with this + revision. + + For cases like this, what you might want to use are + local tags. You can create a local tag + with the option to the + hg tag command. This will + store the tag in a file called .hg/localtags. Unlike .hgtags, .hg/localtags is not revision + controlled. Any tags you create using remain strictly local to the + repository you're currently working in. + + + + + The flow of changes&emdash;big picture vs. little + + To return to the outline I sketched at the + beginning of the chapter, let's think about a project that has + multiple concurrent pieces of work under development at + once.
+ + There might be a push for a new main release; + a new minor bugfix release to the last main release; and an + unexpected hot fix to an old release that is now + in maintenance mode. + + The usual way people refer to these different concurrent + directions of development is as branches. + However, we've already seen numerous times that Mercurial treats + all of history as a series of branches and + merges. Really, what we have here is two ideas that are + peripherally related, but which happen to share a name. + + Big picture branches represent + the sweep of a project's evolution; people give them names, + and talk about them in conversation. + + Little picture branches are + artefacts of the day-to-day activity of developing and + merging changes. They expose the narrative of how the code + was developed. + + + + + Managing big-picture branches in repositories + + The easiest way to isolate a big picture + branch in Mercurial is in a dedicated repository. If you have + an existing shared repository&emdash;let's call it + myproject&emdash;that reaches a + 1.0 milestone, you can start to prepare for + future maintenance releases on top of version 1.0 by tagging the + revision from which you prepared the 1.0 release. + + &interaction.branch-repo.tag; + + You can then clone a new shared + myproject-1.0.1 repository as of that + tag. + + &interaction.branch-repo.clone; + + Afterwards, if someone needs to work on a bug fix that ought + to go into an upcoming 1.0.1 minor release, they clone the + myproject-1.0.1 repository, make their + changes, and push them back. + + &interaction.branch-repo.bugfix; + + Meanwhile, development for + the next major release can continue, isolated and unabated, in + the myproject repository. 
+ + &interaction.branch-repo.new; + + + + Don't repeat yourself: merging across branches + + In many cases, if you have a bug to fix on a maintenance + branch, the chances are good that the bug exists on your + project's main branch (and possibly other maintenance branches, + too). It's a rare developer who wants to fix the same bug + multiple times, so let's look at a few ways that Mercurial can + help you to manage these bugfixes without duplicating your + work. + + In the simplest instance, all you need to do is pull changes + from your maintenance branch into your local clone of the target + branch. + + &interaction.branch-repo.pull; + + You'll then need to merge the heads of the two branches, and + push back to the main branch. + + &interaction.branch-repo.merge; + + + + Naming branches within one repository + + In most instances, isolating branches in repositories is the + right approach. Its simplicity makes it easy to understand; and + so it's hard to make mistakes. There's a one-to-one + relationship between branches you're working in and directories + on your system. This lets you use normal (non-Mercurial-aware) + tools to work on files within a branch/repository. + + If you're more in the power user category + (and your collaborators are too), there is + an alternative way of handling branches that you can consider. + I've already mentioned the human-level distinction between + small picture and big picture + branches. While Mercurial works with multiple small + picture branches in a repository all the time (for + example after you pull changes in, but before you merge them), + it can also work with multiple big + picture branches. + + The key to working this way is that Mercurial lets you + assign a persistent name to a branch. + There always exists a branch named default. + Even before you start naming branches yourself, you can find + traces of the default branch if you look for + them. 
+ + As an example, when you run the hg + commit command, and it pops up your editor so that + you can enter a commit message, look for a line that contains + the text HG: branch default at + the bottom. This is telling you that your commit will occur on + the branch named default. + + To start working with named branches, use the hg branches command. This command + lists the named branches already present in your repository, + telling you which changeset is the tip of each. + + &interaction.branch-named.branches; + + Since you haven't created any named branches yet, the only + one that exists is default. + + To find out what the current branch is, run + the hg branch command, giving + it no arguments. This tells you what branch the parent of the + current changeset is on. + + &interaction.branch-named.branch; + + To create a new branch, run the hg + branch command again. This time, give it one + argument: the name of the branch you want to create. + + &interaction.branch-named.create; + + After you've created a branch, you might wonder what effect + the hg branch command has had. + What do the hg status and + hg tip commands report? + + &interaction.branch-named.status; + + Nothing has changed in the + working directory, and there's been no new history created. As + this suggests, running the hg + branch command has no permanent effect; it only + tells Mercurial what branch name to use the + next time you commit a changeset. + + When you commit a change, Mercurial records the name of the + branch on which you committed. Once you've switched from the + default branch to another and committed, + you'll see the name of the new branch show up in the output of + hg log, hg tip, and other commands that + display the same kind of output. + + &interaction.branch-named.commit; + + The hg log-like commands + will print the branch name of every changeset that's not on the + default branch. As a result, if you never + use named branches, you'll never see this information. 
+ + Once you've named a branch and committed a change with that + name, every subsequent commit that descends from that change + will inherit the same branch name. You can change the name of a + branch at any time, using the hg + branch command. + + &interaction.branch-named.rebranch; + + In practice, this is something you won't do very often, as + branch names tend to have fairly long lifetimes. (This isn't a + rule, just an observation.) + + + + Dealing with multiple named branches in a + repository + + If you have more than one named branch in a repository, + Mercurial will remember the branch that your working directory + is on when you start a command like hg + update or hg pull + -u. It will update the working directory to the tip + of this branch, no matter what the repo-wide tip + is. To update to a revision that's on a different named branch, + you may need to use the + option to hg update. + + This behavior is a little subtle, so let's see it in + action. First, let's remind ourselves what branch we're + currently on, and what branches are in our repository. + + &interaction.branch-named.parents; + + We're on the bar branch, but there also + exists an older foo + branch. + + We can hg update back and + forth between the tips of the foo and + bar branches without needing to use the + option, because this + only involves going backwards and forwards linearly through our + change history. + + &interaction.branch-named.update-switchy; + + If we go back to the foo branch and then + run hg update, it will keep us + on foo, not move us to the tip of + bar. + + &interaction.branch-named.update-nothing; + + Committing a new change on the foo branch + introduces a new head. + + &interaction.branch-named.foo-commit; + + + + Branch names and merging + + As you've probably noticed, merges in Mercurial are not + symmetrical. Let's say our repository has two heads, 17 and 23.
+ If I hg update to 17 and then + hg merge with 23, Mercurial + records 17 as the first parent of the merge, and 23 as the + second. Whereas if I hg update + to 23 and then hg merge with + 17, it records 23 as the first parent, and 17 as the + second. + + This affects Mercurial's choice of branch name when you + merge. After a merge, Mercurial will retain the branch name of + the first parent when you commit the result of the merge. If + your first parent's branch name is foo, and + you merge with bar, the branch name will + still be foo after you merge. + + It's not unusual for a repository to contain multiple heads, + each with the same branch name. Let's say I'm working on the + foo branch, and so are you. We commit + different changes; I pull your changes; I now have two heads, + each claiming to be on the foo branch. The + result of a merge will be a single head on the + foo branch, as you might hope. + + But if I'm working on the bar branch, and + I merge work from the foo branch, the result + will remain on the bar branch. + + &interaction.branch-named.merge; + + To give a more concrete example, if I'm working on the + bleeding-edge branch, and I want to bring in + the latest fixes from the stable branch, + Mercurial will choose the right + (bleeding-edge) branch name when I pull and + merge from stable. + + + + Branch naming is generally useful + + You shouldn't think of named branches as applicable only to + situations where you have multiple long-lived branches + cohabiting in a single repository. They're very useful even in + the one-branch-per-repository case. + + In the simplest case, giving a name to each branch gives you + a permanent record of which branch a changeset originated on. + This gives you more context when you're trying to follow the + history of a long-lived branchy project. 
+ + If you're working with shared repositories, you can set up a + pretxnchangegroup hook on each + that will block incoming changes that have the + wrong branch name. This provides a simple, but + effective, defence against people accidentally pushing changes + from a bleeding edge branch to a + stable branch. Such a hook might look like this + inside the shared repo's + /.hgrc. + [hooks] +pretxnchangegroup.branch = hg heads --template '{branches} ' | grep mybranch + + + + diff -r 5276f40fca1c -r 896ab6eaf1c6 en/ch08-undo.xml --- a/en/ch08-undo.xml Thu Jul 09 13:32:44 2009 +0900 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,1069 +0,0 @@ - - - - - Finding and fixing mistakes - - To err might be human, but to really handle the consequences - well takes a top-notch revision control system. In this chapter, - we'll discuss some of the techniques you can use when you find - that a problem has crept into your project. Mercurial has some - highly capable features that will help you to isolate the sources - of problems, and to handle them appropriately. - - - Erasing local history - - - The accidental commit - - I have the occasional but persistent problem of typing - rather more quickly than I can think, which sometimes results - in me committing a changeset that is either incomplete or - plain wrong. In my case, the usual kind of incomplete - changeset is one in which I've created a new source file, but - forgotten to hg add it. A - plain wrong changeset is not as common, but no - less annoying. - - - - Rolling back a transaction - - In , I - mentioned that Mercurial treats each modification of a - repository as a transaction. Every time - you commit a changeset or pull changes from another - repository, Mercurial remembers what you did. You can undo, - or roll back, exactly one of these - actions using the hg rollback - command. (See - for an important caveat about the use of this command.) 
- - Here's a mistake that I often find myself making: - committing a change in which I've created a new file, but - forgotten to hg add - it. - - &interaction.rollback.commit; - - Looking at the output of hg - status after the commit immediately confirms the - error. - - &interaction.rollback.status; - - The commit captured the changes to the file - a, but not the new file - b. If I were to push this changeset to a - repository that I shared with a colleague, the chances are - high that something in a would refer to - b, which would not be present in their - repository when they pulled my changes. I would thus become - the object of some indignation. - - However, luck is with me&emdash;I've caught my error - before I pushed the changeset. I use the hg rollback command, and Mercurial - makes that last changeset vanish. - - &interaction.rollback.rollback; - - Notice that the changeset is no longer present in the - repository's history, and the working directory once again - thinks that the file a is modified. The - commit and rollback have left the working directory exactly as - it was prior to the commit; the changeset has been completely - erased. I can now safely hg - add the file b, and rerun my - commit. - - &interaction.rollback.add; - - - - The erroneous pull - - It's common practice with Mercurial to maintain separate - development branches of a project in different repositories. - Your development team might have one shared repository for - your project's 0.9 release, and another, - containing different changes, for the 1.0 - release. - - Given this, you can imagine that the consequences could be - messy if you had a local 0.9 repository, and - accidentally pulled changes from the shared 1.0 - repository into it. At worst, you could be paying - insufficient attention, and push those changes into the shared - 0.9 tree, confusing your entire team (but don't - worry, we'll return to this horror scenario later). 
However, - it's more likely that you'll notice immediately, because - Mercurial will display the URL it's pulling from, or you will - see it pull a suspiciously large number of changes into the - repository. - - The hg rollback command - will work nicely to expunge all of the changesets that you - just pulled. Mercurial groups all changes from one hg pull into a single transaction, - so one hg rollback is all you - need to undo this mistake. - - - - Rolling back is useless once you've pushed - - The value of the hg - rollback command drops to zero once you've pushed - your changes to another repository. Rolling back a change - makes it disappear entirely, but only in - the repository in which you perform the hg rollback. Because a rollback - eliminates history, there's no way for the disappearance of a - change to propagate between repositories. - - If you've pushed a change to another - repository&emdash;particularly if it's a shared - repository&emdash;it has essentially escaped into the - wild, and you'll have to recover from your mistake - in a different way. What will happen if you push a changeset - somewhere, then roll it back, then pull from the repository - you pushed to, is that the changeset will reappear in your - repository. - - (If you absolutely know for sure that the change you want - to roll back is the most recent change in the repository that - you pushed to, and you know that nobody - else could have pulled it from that repository, you can roll - back the changeset there, too, but you really should really - not rely on this working reliably. If you do this, sooner or - later a change really will make it into a repository that you - don't directly control (or have forgotten about), and come - back to bite you.) - - - - You can only roll back once - - Mercurial stores exactly one transaction in its - transaction log; that transaction is the most recent one that - occurred in the repository. This means that you can only roll - back one transaction. 
If you expect to be able to roll back - one transaction, then its predecessor, this is not the - behavior you will get. - - &interaction.rollback.twice; - - Once you've rolled back one transaction in a repository, - you can't roll back again in that repository until you perform - another commit or pull. - - - - - Reverting the mistaken change - - If you make a modification to a file, and decide that you - really didn't want to change the file at all, and you haven't - yet committed your changes, the hg - revert command is the one you'll need. It looks at - the changeset that's the parent of the working directory, and - restores the contents of the file to their state as of that - changeset. (That's a long-winded way of saying that, in the - normal case, it undoes your modifications.) - - Let's illustrate how the hg - revert command works with yet another small example. - We'll begin by modifying a file that Mercurial is already - tracking. - - &interaction.daily.revert.modify; - - If we don't - want that change, we can simply hg - revert the file. - - &interaction.daily.revert.unmodify; - - The hg revert command - provides us with an extra degree of safety by saving our - modified file with a .orig - extension. - - &interaction.daily.revert.status; - - Here is a summary of the cases that the hg revert command can deal with. We - will describe each of these in more detail in the section that - follows. - - If you modify a file, it will restore the file - to its unmodified state. - - If you hg add a - file, it will undo the added state of the - file, but leave the file itself untouched. - - If you delete a file without telling Mercurial, - it will restore the file to its unmodified contents. - - If you use the hg - remove command to remove a file, it will undo - the removed state of the file, and restore - the file to its unmodified contents. - - - - File management errors - - The hg revert command is - useful for more than just modified files. 
It lets you reverse - the results of all of Mercurial's file management - commands&emdash;hg add, - hg remove, and so on. - - If you hg add a file, - then decide that in fact you don't want Mercurial to track it, - use hg revert to undo the - add. Don't worry; Mercurial will not modify the file in any - way. It will just unmark the file. - - &interaction.daily.revert.add; - - Similarly, if you ask Mercurial to hg remove a file, you can use - hg revert to restore it to - the contents it had as of the parent of the working directory. - &interaction.daily.revert.remove; This works just as - well for a file that you deleted by hand, without telling - Mercurial (recall that in Mercurial terminology, this kind of - file is called missing). - - &interaction.daily.revert.missing; - - If you revert a hg copy, - the copied-to file remains in your working directory - afterwards, untracked. Since a copy doesn't affect the - copied-from file in any way, Mercurial doesn't do anything - with the copied-from file. - - &interaction.daily.revert.copy; - - - A slightly special case: reverting a rename - - If you hg rename a - file, there is one small detail that you should remember. - When you hg revert a - rename, it's not enough to provide the name of the - renamed-to file, as you can see here. - - &interaction.daily.revert.rename; - - As you can see from the output of hg status, the renamed-to file is - no longer identified as added, but the - renamed-from file is still removed! - This is counter-intuitive (at least to me), but at least - it's easy to deal with. - - &interaction.daily.revert.rename-orig; - - So remember, to revert a hg - rename, you must provide - both the source and destination - names. - - % TODO: the output doesn't look like it will be - removed! - - (By the way, if you rename a file, then modify the - renamed-to file, then revert both components of the rename, - when Mercurial restores the file that was removed as part of - the rename, it will be unmodified. 
If you need the - modifications in the renamed-to file to show up in the - renamed-from file, don't forget to copy them over.) - - These fiddly aspects of reverting a rename arguably - constitute a small bug in Mercurial. - - - - - - Dealing with committed changes - - Consider a case where you have committed a change $a$, and - another change $b$ on top of it; you then realise that change - $a$ was incorrect. Mercurial lets you back out - an entire changeset automatically, and building blocks that let - you reverse part of a changeset by hand. - - Before you read this section, here's something to - keep in mind: the hg backout - command undoes changes by adding history, - not by modifying or erasing it. It's the right tool to use if - you're fixing bugs, but not if you're trying to undo some change - that has catastrophic consequences. To deal with those, see - . - - - Backing out a changeset - - The hg backout command - lets you undo the effects of an entire - changeset in an automated fashion. Because Mercurial's - history is immutable, this command does - not get rid of the changeset you want to undo. - Instead, it creates a new changeset that - reverses the effect of the to-be-undone - changeset. - - The operation of the hg - backout command is a little intricate, so let's - illustrate it with some examples. First, we'll create a - repository with some simple changes. - - &interaction.backout.init; - - The hg backout command - takes a single changeset ID as its argument; this is the - changeset to back out. Normally, hg - backout will drop you into a text editor to write - a commit message, so you can record why you're backing the - change out. In this example, we provide a commit message on - the command line using the option. - - - - Backing out the tip changeset - - We're going to start by backing out the last changeset we - committed. - - &interaction.backout.simple; - - You can see that the second line from - myfile is no longer present. 
Taking a - look at the output of hg log - gives us an idea of what the hg - backout command has done. - &interaction.backout.simple.log; Notice that the new changeset - that hg backout has created - is a child of the changeset we backed out. It's easier to see - this in , which presents a - graphical view of the change history. As you can see, the - history is nice and linear. - -
- Backing out a change using the <command - role="hg-cmd">hg backout</command> command - - - XXX add text - -
- -
- - Backing out a non-tip change - - If you want to back out a change other than the last one - you committed, pass the option to the - hg backout command. - - &interaction.backout.non-tip.clone; - - This makes backing out any changeset a - one-shot operation that's usually simple and - fast. - - &interaction.backout.non-tip.backout; - - If you take a look at the contents of - myfile after the backout finishes, you'll - see that the first and third changes are present, but not the - second. - - &interaction.backout.non-tip.cat; - - As the graphical history in illustrates, Mercurial - actually commits two changes in this kind - of situation (the box-shaped nodes are the ones that Mercurial - commits automatically). Before Mercurial begins the backout - process, it first remembers what the current parent of the - working directory is. It then backs out the target changeset, - and commits that as a changeset. Finally, it merges back to - the previous parent of the working directory, and commits the - result of the merge. - - % TODO: to me it looks like mercurial doesn't commit the - second merge automatically! - -
- Automated backout of a non-tip change using the - <command role="hg-cmd">hg backout</command> command - - - XXX add text - -
- - The result is that you end up back where you - were, only with some extra history that undoes the - effect of the changeset you wanted to back out. - - - Always use the <option - role="hg-opt-backout">--merge</option> option - - In fact, since the --merge option will do the - right thing whether or not the changeset - you're backing out is the tip (i.e. it won't try to merge if - it's backing out the tip, since there's no need), you should - always use this option when you run the - hg backout command. - - - 
- - Gaining more control of the backout process - - While I've recommended that you always use the --merge option when backing - out a change, the hg backout - command lets you decide how to merge a backout changeset. - Taking control of the backout process by hand is something you - will rarely need to do, but it can be useful to understand - what the hg backout command - is doing for you automatically. To illustrate this, let's - clone our first repository, but omit the backout change that - it contains. - - &interaction.backout.manual.clone; - - As with our - earlier example, we'll commit a third changeset, then back out - its parent, and see what happens. - - &interaction.backout.manual.backout; - - Our new changeset is again a descendant of the changeset - we backed out; it's thus a new head, not - a descendant of the changeset that was the tip. The hg backout command was quite - explicit in telling us this. - - &interaction.backout.manual.log; - - Again, it's easier to see what has happened by looking at - a graph of the revision history, in . This makes it clear - that when we use hg backout - to back out a change other than the tip, Mercurial adds a new - head to the repository (the change it committed is - box-shaped). - - 
- Backing out a change using the <command - role="hg-cmd">hg backout</command> command - - - XXX add text - -
- - After the hg backout - command has completed, it leaves the new - backout changeset as the parent of the working - directory. - - &interaction.backout.manual.parents; - - Now we have two isolated sets of changes. - - &interaction.backout.manual.heads; - - Let's think about what we expect to see as the contents of - myfile now. The first change should be - present, because we've never backed it out. The second change - should be missing, as that's the change we backed out. Since - the history graph shows the third change as a separate head, - we don't expect to see the third change - present in myfile. - - &interaction.backout.manual.cat; - - To get the third change back into the file, we just do a - normal merge of our two heads. - - &interaction.backout.manual.merge; - - Afterwards, the graphical history of our - repository looks like - . - -
- Manually merging a backout change - - - XXX add text - -
- -
- - Why <command role="hg-cmd">hg backout</command> works as - it does - - Here's a brief description of how the hg backout command works. - - It ensures that the working directory is - clean, i.e. that the output of hg status would be empty. - - It remembers the current parent of the working - directory. Let's call this changeset - orig - - It does the equivalent of a hg update to sync the working - directory to the changeset you want to back out. Let's - call this changeset backout - - It finds the parent of that changeset. Let's - call that changeset parent. - - For each file that the - backout changeset affected, it does the - equivalent of a hg revert -r - parent on that file, to restore it to the - contents it had before that changeset was - committed. - - It commits the result as a new changeset. - This changeset has backout as its - parent. - - If you specify on the command - line, it merges with orig, and commits - the result of the merge. - - - An alternative way to implement the hg backout command would be to - hg export the - to-be-backed-out changeset as a diff, then use the option to the - patch command to reverse the effect of the - change without fiddling with the working directory. This - sounds much simpler, but it would not work nearly as - well. - - The reason that hg - backout does an update, a commit, a merge, and - another commit is to give the merge machinery the best chance - to do a good job when dealing with all the changes - between the change you're backing out and - the current tip. - - If you're backing out a changeset that's 100 revisions - back in your project's history, the chances that the - patch command will be able to apply a - reverse diff cleanly are not good, because intervening changes - are likely to have broken the context that - patch uses to determine whether it can - apply a patch (if this sounds like gibberish, see for a - discussion of the patch command). 
Also, - Mercurial's merge machinery will handle files and directories - being renamed, permission changes, and modifications to binary - files, none of which patch can deal - with. - - -
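The central steps of the list above (find the parent of the changeset being backed out, restore every affected file to its pre-change contents, commit the result as a child of that changeset) can be sketched with a toy model. This is only an illustration, not Mercurial's implementation: the `history` dictionary, the snapshot representation, and the function name `backout_snapshot` are all assumptions invented for this example, and the final merge with orig is deliberately left out.

```python
def backout_snapshot(history, target):
    """Toy model of the commit that a backout creates (merge step not modelled).

    history maps a changeset id to (parent_id, files), where files is a full
    snapshot {filename: content}. Returns (parent_id, files) for the new
    changeset, whose parent is the changeset being backed out.
    """
    parent_id, target_files = history[target]
    parent_files = history[parent_id][1] if parent_id is not None else {}

    # Files the target changeset touched: added, removed, or modified.
    affected = {
        name
        for name in set(target_files) | set(parent_files)
        if target_files.get(name) != parent_files.get(name)
    }

    # Start from the target's snapshot and undo each affected file,
    # much as reverting each file to the parent revision would.
    new_files = dict(target_files)
    for name in affected:
        if name in parent_files:
            new_files[name] = parent_files[name]
        else:
            del new_files[name]  # the target added this file, so remove it

    return (target, new_files)
```

In this model the backout changeset always ends up with the same contents as the parent of the changeset it undoes. Bringing later changes back in is the job of the merge step, which this sketch does not attempt.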
- - Changes that should never have been - - Most of the time, the hg - backout command is exactly what you need if you want - to undo the effects of a change. It leaves a permanent record - of exactly what you did, both when committing the original - changeset and when you cleaned up after it. - - On rare occasions, though, you may find that you've - committed a change that really should not be present in the - repository at all. For example, it would be very unusual, and - usually considered a mistake, to commit a software project's - object files as well as its source files. Object files have - almost no intrinsic value, and they're big, - so they increase the size of the repository and the amount of - time it takes to clone or pull changes. - - Before I discuss the options that you have if you commit a - brown paper bag change (the kind that's so bad - that you want to pull a brown paper bag over your head), let me - first discuss some approaches that probably won't work. - - Since Mercurial treats history as - accumulative&emdash;every change builds on top of all changes - that preceded it&emdash;you generally can't just make disastrous - changes disappear. The one exception is when you've just - committed a change, and it hasn't been pushed or pulled into - another repository. That's when you can safely use the hg rollback command, as I detailed in - . - - After you've pushed a bad change to another repository, you - could still use hg - rollback to make your local copy of the change - disappear, but it won't have the consequences you want. The - change will still be present in the remote repository, so it - will reappear in your local repository the next time you - pull. - - If a situation like this arises, and you know which - repositories your bad change has propagated into, you can - try to get rid of the change from - every one of those repositories.
This is, - of course, not a satisfactory solution: if you miss even a - single repository while you're expunging, the change is still - in the wild, and could propagate further. - - If you've committed one or more changes - after the change that you'd like to see - disappear, your options are further reduced. Mercurial doesn't - provide a way to punch a hole in history, leaving - changesets intact. - - XXX This needs filling out. The - hg-replay script in the - examples directory works, but doesn't handle - merge changesets. Kind of an important omission. - - - Protect yourself from <quote>escaped</quote> - changes - - If you've committed some changes to your local repository - and they've been pushed or pulled somewhere else, this isn't - necessarily a disaster. You can protect yourself ahead of - time against some classes of bad changeset. This is - particularly easy if your team usually pulls changes from a - central repository. - - By configuring some hooks on that repository to validate - incoming changesets (see chapter ), - you can - automatically prevent some kinds of bad changeset from being - pushed to the central repository at all. With such a - configuration in place, some kinds of bad changeset will - naturally tend to die out because they can't - propagate into the central repository. Better yet, this - happens without any need for explicit intervention. - - For instance, an incoming change hook that verifies that a - changeset will actually compile can prevent people from - inadvertently breaking the build. - - - - - Finding the source of a bug - - While it's all very well to be able to back out a changeset - that introduced a bug, this requires that you know which - changeset to back out. Mercurial provides an invaluable - command, called hg bisect, that - helps you to automate this process and accomplish it very - efficiently.
- - The idea behind the hg - bisect command is that a changeset has introduced - some change of behavior that you can identify with a simple - binary test. You don't know which piece of code introduced the - change, but you know how to test for the presence of the bug. - The hg bisect command uses your - test to direct its search for the changeset that introduced the - code that caused the bug. - - Here are a few scenarios to help you understand how you - might apply this command. - - The most recent version of your software has a - bug that you remember wasn't present a few weeks ago, but - you don't know when it was introduced. Here, your binary - test checks for the presence of that bug. - - You fixed a bug in a rush, and now it's time to - close the entry in your team's bug database. The bug - database requires a changeset ID when you close an entry, - but you don't remember which changeset you fixed the bug in. - Once again, your binary test checks for the presence of the - bug. - - Your software works correctly, but runs 15% - slower than the last time you measured it. You want to know - which changeset introduced the performance regression. In - this case, your binary test measures the performance of your - software, to see whether it's fast or - slow. - - The sizes of the components of your project that - you ship exploded recently, and you suspect that something - changed in the way you build your project. - - - From these examples, it should be clear that the hg bisect command is not useful only - for finding the sources of bugs. You can use it to find any - emergent property of a repository (anything that - you can't find from a simple text search of the files in the - tree) for which you can write a binary test. - - We'll introduce a little bit of terminology here, just to - make it clear which parts of the search process are your - responsibility, and which are Mercurial's. A - test is something that - you run when hg - bisect chooses a changeset. 
A - probe is what hg - bisect runs to tell whether a revision is good. - Finally, we'll use the word bisect, as both a - noun and a verb, to stand in for the phrase search using - the hg bisect - command. - - One simple way to automate the searching process would be - simply to probe every changeset. However, this scales poorly. - If it took ten minutes to test a single changeset, and you had - 10,000 changesets in your repository, the exhaustive approach - would take on average 35 days to find the - changeset that introduced a bug. Even if you knew that the bug - was introduced by one of the last 500 changesets, and limited - your search to those, you'd still be looking at over 40 hours to - find the changeset that introduced your bug. - - What the hg bisect command - does is use its knowledge of the shape of your - project's revision history to perform a search in time - proportional to the logarithm of the number - of changesets to check (the kind of search it performs is called - a dichotomic search). With this approach, searching through - 10,000 changesets will take less than three hours, even at ten - minutes per test (the search will require about 14 tests). - Limit your search to the last hundred changesets, and it will - take only about an hour (roughly seven tests). - - The hg bisect command is - aware of the branchy nature of a Mercurial - project's revision history, so it has no problems dealing with - branches, merges, or multiple heads in a repository. It can - prune entire branches of history with a single probe, which is - how it operates so efficiently. - - - Using the <command role="hg-cmd">hg bisect</command> - command - - Here's an example of hg - bisect in action. - - - In versions 0.9.5 and earlier of Mercurial, hg bisect was not a core command: - it was distributed with Mercurial as an extension. This - section describes the built-in command, not the old - extension. 
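The search-cost figures quoted in the introduction to this section can be checked with a little arithmetic. The sketch below is not how hg bisect is implemented. It assumes a linear history, ten minutes per test, an exhaustive search that on average tests half of the candidates, and a binary search that needs roughly the base-2 logarithm of the number of changesets, rounded up. The function names are made up for this illustration.

```python
import math

MINUTES_PER_TEST = 10  # the per-test cost used in the text


def exhaustive_minutes(changesets):
    """Average cost of testing changesets one at a time: about half of them."""
    return changesets / 2 * MINUTES_PER_TEST


def bisect_tests(changesets):
    """Approximate number of tests a binary search needs on a linear history."""
    return math.ceil(math.log2(changesets))


def bisect_minutes(changesets):
    """Total time for the binary search at MINUTES_PER_TEST per test."""
    return bisect_tests(changesets) * MINUTES_PER_TEST


for n in (10000, 500, 100):
    hours_exhaustive = exhaustive_minutes(n) / 60
    print(n, round(hours_exhaustive, 1), bisect_tests(n), bisect_minutes(n))
```

For 10,000 changesets this gives 14 tests and 140 minutes for the binary search, against an average of roughly 35 days for the exhaustive approach; for 100 changesets it gives seven tests.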
- - - Now let's create a repository, so that we can try out the - hg bisect command in - isolation. - - &interaction.bisect.init; - - We'll simulate a project that has a bug in it in a - simple-minded way: create trivial changes in a loop, and - nominate one specific change that will have the - bug. This loop creates 35 changesets, each - adding a single file to the repository. We'll represent our - bug with a file that contains the text i - have a gub. - - &interaction.bisect.commits; - - The next thing that we'd like to do is figure out how to - use the hg bisect command. - We can use Mercurial's normal built-in help mechanism for - this. - - &interaction.bisect.help; - - The hg bisect command - works in steps. Each step proceeds as follows. - - You run your binary test. - - If the test succeeded, you tell hg bisect by running the - hg bisect good - command. - - If it failed, run the hg bisect bad - command. - - The command uses your information to decide - which changeset to test next. - - It updates the working directory to that - changeset, and the process begins again. - - The process ends when hg - bisect identifies a unique changeset that marks - the point where your test transitioned from - succeeding to failing. - - To start the search, we must run the hg bisect --reset command. - - &interaction.bisect.search.init; - - In our case, the binary test we use is simple: we check to - see if any file in the repository contains the string i - have a gub. If it does, this changeset contains the - change that caused the bug. By convention, a - changeset that has the property we're searching for is - bad, while one that doesn't is - good. - - Most of the time, the revision to which the working - directory is synced (usually the tip) already exhibits the - problem introduced by the buggy change, so we'll mark it as - bad. 
- - &interaction.bisect.search.bad-init; - - Our next task is to nominate a changeset that we know - doesn't have the bug; the hg bisect command will - bracket its search between the first pair of - good and bad changesets. In our case, we know that revision - 10 didn't have the bug. (I'll have more words about choosing - the first good changeset later.) - - &interaction.bisect.search.good-init; - - Notice that this command printed some output. - - It told us how many changesets it must - consider before it can identify the one that introduced - the bug, and how many tests that will require. - - It updated the working directory to the next - changeset to test, and told us which changeset it's - testing. - - - We now run our test in the working directory. We use the - grep command to see if our - bad file is present in the working directory. - If it is, this revision is bad; if not, this revision is good. - &interaction.bisect.search.step1; - - This test looks like a perfect candidate for automation, - so let's turn it into a shell function. - &interaction.bisect.search.mytest; - - We can now run an entire test step with a single command, - mytest. - - &interaction.bisect.search.step2; - - A few more invocations of our canned test step command, - and we're done. - - &interaction.bisect.search.rest; - - Even though we had 40 changesets to search through, the - hg bisect command let us find - the changeset that introduced our bug with only - five tests. Because the number of tests that the hg bisect command performs grows - logarithmically with the number of changesets to search, the - advantage that it has over the brute force - search approach increases with every changeset you add. - - - - Cleaning up after your search - - When you're finished using the hg - bisect command in a repository, you can use the - hg bisect reset command to - drop the information it was using to drive your search. 
The - command doesn't use much space, so it doesn't matter if you - forget to run this command. However, hg bisect won't let you start a new - search in that repository until you do a hg bisect reset. - - &interaction.bisect.search.reset; - - - - - Tips for finding bugs effectively - - - Give consistent input - - The hg bisect command - requires that you correctly report the result of every test - you perform. If you tell it that a test failed when it really - succeeded, it might be able to detect the - inconsistency. If it can identify an inconsistency in your - reports, it will tell you that a particular changeset is both - good and bad. However, it can't do this perfectly; it's about - as likely to report the wrong changeset as the source of the - bug. - - - - Automate as much as possible - - When I started using the hg - bisect command, I tried a few times to run my - tests by hand, on the command line. This is an approach that - I, at least, am not suited to. After a few tries, I found - that I was making enough mistakes that I was having to restart - my searches several times before finally getting correct - results. - - My initial problems with driving the hg bisect command by hand occurred - even with simple searches on small repositories; if the - problem you're looking for is more subtle, or the number of - tests that hg bisect must - perform increases, the likelihood of operator error ruining - the search is much higher. Once I started automating my - tests, I had much better results. - - The key to automated testing is twofold: - - always test for the same symptom, and - - always feed consistent input to the hg bisect command. - - In my tutorial example above, the grep - command tests for the symptom, and the if - statement takes the result of this check and ensures that we - always feed the same input to the hg - bisect command. The mytest - function marries these together in a reproducible way, so that - every test is uniform and consistent. 
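As a rough illustration of the two rules above, here is a sketch of an automated test written in Python rather than as a shell function. It is an assumption-laden stand-in for the tutorial's mytest, not a copy of it: the function names are invented, and skipping the .hg directory is a choice made for this example. The symptom check always looks for the same text, and the result is always mapped onto one of the same two words.

```python
import os

SYMPTOM = "i have a gub"  # the marker text used in the tutorial


def has_symptom(root):
    """Return True if any file under root contains SYMPTOM."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Assumption for this sketch: ignore Mercurial's own metadata.
        if ".hg" in dirnames:
            dirnames.remove(".hg")
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, errors="replace") as handle:
                    if SYMPTOM in handle.read():
                        return True
            except OSError:
                continue  # unreadable file: treat it as not showing the symptom
    return False


def verdict(root):
    """Map the test result onto the single word to report to hg bisect."""
    return "bad" if has_symptom(root) else "good"
```

The word returned by verdict would then be passed to hg bisect in whatever form your version of Mercurial expects. Keeping the check and the mapping in one place is what makes every step of the search uniform.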
- - - - Check your results - - Because the output of a hg - bisect search is only as good as the input you - give it, don't take the changeset it reports as the absolute - truth. A simple way to cross-check its report is to manually - run your test at each of the following changesets: - - The changeset that it reports as the first bad - revision. Your test should still report this as - bad. - - The parent of that changeset (either parent, - if it's a merge). Your test should report this changeset - as good. - - A child of that changeset. Your test should - report this changeset as bad. - - - - - Beware interference between bugs - - It's possible that your search for one bug could be - disrupted by the presence of another. For example, let's say - your software crashes at revision 100, and worked correctly at - revision 50. Unknown to you, someone else introduced a - different crashing bug at revision 60, and fixed it at - revision 80. This could distort your results in one of - several ways. - - It is possible that this other bug completely - masks yours, which is to say that it occurs - before your bug has a chance to manifest itself. If you can't - avoid that other bug (for example, it prevents your project - from building), and so can't tell whether your bug is present - in a particular changeset, the hg - bisect command cannot help you directly. Instead, - you can mark a changeset as untested by running hg bisect --skip. - - A different problem could arise if your test for a bug's - presence is not specific enough. If you check for my - program crashes, then both your crashing bug and an - unrelated crashing bug that masks it will look like the same - thing, and mislead hg - bisect. - - Another useful situation in which to use hg bisect --skip is if you can't - test a revision because your project was in a broken and hence - untestable state at that revision, perhaps because someone - checked in a change that prevented the project from - building. 
- - - - Bracket your search lazily - - Choosing the first good and - bad changesets that will mark the end points of - your search is often easy, but it bears a little discussion - nevertheless. From the perspective of hg bisect, the newest - changeset is conventionally bad, and the older - changeset is good. - - If you're having trouble remembering when a suitable - good change was, so that you can tell hg bisect, you could do worse than - testing changesets at random. Just remember to eliminate - contenders that can't possibly exhibit the bug (perhaps - because the feature with the bug isn't present yet) and those - where another problem masks the bug (as I discussed - above). - - Even if you end up early by thousands of - changesets or months of history, you will only add a handful - of tests to the total number that hg - bisect must perform, thanks to its logarithmic - behavior. - - - -
- - diff -r 5276f40fca1c -r 896ab6eaf1c6 en/ch09-hook.xml --- a/en/ch09-hook.xml Thu Jul 09 13:32:44 2009 +0900 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,2038 +0,0 @@ - - - - - Handling repository events with hooks - - Mercurial offers a powerful mechanism to let you perform - automated actions in response to events that occur in a - repository. In some cases, you can even control Mercurial's - response to those events. - - The name Mercurial uses for one of these actions is a - hook. Hooks are called - triggers in some revision control systems, but the - two names refer to the same idea. - - - An overview of hooks in Mercurial - - Here is a brief list of the hooks that Mercurial - supports. We will revisit each of these hooks in more detail - later, in . - - - changegroup: This - is run after a group of changesets has been brought into the - repository from elsewhere. - - commit: This is - run after a new changeset has been created in the local - repository. - - incoming: This is - run once for each new changeset that is brought into the - repository from elsewhere. Notice the difference from - changegroup, which is run - once per group of changesets brought - in. - - outgoing: This is - run after a group of changesets has been transmitted from - this repository. - - prechangegroup: - This is run before starting to bring a group of changesets - into the repository. - - - precommit: - Controlling. This is run before starting a commit. - - - preoutgoing: - Controlling. This is run before starting to transmit a group - of changesets from this repository. - - - pretag: - Controlling. This is run before creating a tag. - - - pretxnchangegroup: Controlling. This - is run after a group of changesets has been brought into the - local repository from another, but before the transaction - completes that will make the changes permanent in the - repository. - - - pretxncommit: - Controlling. 
This is run after a new changeset has been - created in the local repository, but before the transaction - completes that will make it permanent. - - - preupdate: - Controlling. This is run before starting an update or merge - of the working directory. - - - tag: This is run - after a tag is created. - - - update: This is - run after an update or merge of the working directory has - finished. - - - Each of the hooks whose description begins with the word - Controlling has the ability to determine whether - an activity can proceed. If the hook succeeds, the activity may - proceed; if it fails, the activity is either not permitted or - undone, depending on the hook. - - - - - Hooks and security - - - Hooks are run with your privileges - - When you run a Mercurial command in a repository, and the - command causes a hook to run, that hook runs on - your system, under - your user account, with - your privilege level. Since hooks are - arbitrary pieces of executable code, you should treat them - with an appropriate level of suspicion. Do not install a hook - unless you are confident that you know who created it and what - it does. - - - In some cases, you may be exposed to hooks that you did - not install yourself. If you work with Mercurial on an - unfamiliar system, Mercurial will run hooks defined in that - system's global ~/.hgrc - file. - - - If you are working with a repository owned by another - user, Mercurial can run hooks defined in that user's - repository, but it will still run them as you. - For example, if you hg pull - from that repository, and its .hg/hgrc defines a local outgoing hook, that hook will run - under your user account, even though you don't own that - repository. - - - - This only applies if you are pulling from a repository - on a local or network filesystem. If you're pulling over - http or ssh, any outgoing - hook will run under whatever account is executing the server - process, on the server. 
- - - - XXX To see what hooks are defined in a repository, use the - hg config hooks command. If - you are working in one repository, but talking to another that - you do not own (e.g. using hg - pull or hg - incoming), remember that it is the other - repository's hooks you should be checking, not your own. - - - - - Hooks do not propagate - - In Mercurial, hooks are not revision controlled, and do - not propagate when you clone, or pull from, a repository. The - reason for this is simple: a hook is a completely arbitrary - piece of executable code. It runs under your user identity, - with your privilege level, on your machine. - - - It would be extremely reckless for any distributed - revision control system to implement revision-controlled - hooks, as this would offer an easily exploitable way to - subvert the accounts of users of the revision control system. - - - Since Mercurial does not propagate hooks, if you are - collaborating with other people on a common project, you - should not assume that they are using the same Mercurial hooks - as you are, or that theirs are correctly configured. You - should document the hooks you expect people to use. - - - In a corporate intranet, this is somewhat easier to - control, as you can for example provide a - standard installation of Mercurial on an NFS - filesystem, and use a site-wide ~/.hgrc file to define hooks that all users will - see. However, this too has its limits; see below. - - - - - Hooks can be overridden - - Mercurial allows you to override a hook definition by - redefining the hook. You can disable it by setting its value - to the empty string, or change its behavior as you wish. - - - If you deploy a system- or site-wide ~/.hgrc file that defines some - hooks, you should thus understand that your users can disable - or override those hooks. - - - - - Ensuring that critical hooks are run - - Sometimes you may want to enforce a policy that you do not - want others to be able to work around. 
For example, you may - have a requirement that every changeset must pass a rigorous - set of tests. Defining this requirement via a hook in a - site-wide ~/.hgrc won't - work for remote users on laptops, and of course local users - can subvert it at will by overriding the hook. - - - Instead, you can set up your policies for use of Mercurial - so that people are expected to propagate changes through a - well-known canonical server that you have - locked down and configured appropriately. - - - One way to do this is via a combination of social - engineering and technology. Set up a restricted-access - account; users can push changes over the network to - repositories managed by this account, but they cannot log into - the account and run normal shell commands. In this scenario, - a user can commit a changeset that contains any old garbage - they want. - - - When someone pushes a changeset to the server that - everyone pulls from, the server will test the changeset before - it accepts it as permanent, and reject it if it fails to pass - the test suite. If people only pull changes from this - filtering server, it will serve to ensure that all changes - that people pull have been automatically vetted. - - - - - - Care with <literal>pretxn</literal> hooks in a - shared-access repository - - If you want to use hooks to do some automated work in a - repository that a number of people have shared access to, you - need to be careful in how you do this. - - - Mercurial only locks a repository when it is writing to the - repository, and only the parts of Mercurial that write to the - repository pay attention to locks. Write locks are necessary to - prevent multiple simultaneous writers from scribbling on each - other's work, corrupting the repository. - - - Because Mercurial is careful with the order in which it - reads and writes data, it does not need to acquire a lock when - it wants to read data from the repository. 
The parts of - Mercurial that read from the repository never pay attention to - locks. This lockless reading scheme greatly increases - performance and concurrency. - - - With great performance comes a trade-off, though, one which - has the potential to cause you trouble unless you're aware of - it. To describe this requires a little detail about how - Mercurial adds changesets to a repository and reads those - changes. - - - When Mercurial writes metadata, it - writes it straight into the destination file. It writes file - data first, then manifest data (which contains pointers to the - new file data), then changelog data (which contains pointers to - the new manifest data). Before the first write to each file, it - stores a record of where the end of the file was in its - transaction log. If the transaction must be rolled back, - Mercurial simply truncates each file back to the size it was - before the transaction began. - - - When Mercurial reads metadata, it reads - the changelog first, then everything else. Since a reader will - only access parts of the manifest or file metadata that it can - see in the changelog, it can never see partially written data. - - - Some controlling hooks (pretxncommit and pretxnchangegroup) run when a - transaction is almost complete. All of the metadata has been - written, but Mercurial can still roll the transaction back and - cause the newly-written data to disappear. - - - If one of these hooks runs for long, it opens a window of - time during which a reader can see the metadata for changesets - that are not yet permanent, and should not be thought of as - really there. The longer the hook runs, the - longer that window is open. - - - - The problem illustrated - - In principle, a good use for the pretxnchangegroup hook would be to - automatically build and test incoming changes before they are - accepted into a central repository. 
This could let you - guarantee that nobody can push changes to this repository that - break the build. But if a client can pull - changes while they're being tested, the usefulness of the test - is zero; an unsuspecting someone can pull untested changes, - potentially breaking their build. - - - The safest technological answer to this challenge is to - set up such a gatekeeper repository as - unidirectional. Let it take changes - pushed in from the outside, but do not allow anyone to pull - changes from it (use the preoutgoing hook to lock it down). - Configure a changegroup hook so - that if a build or test succeeds, the hook will push the new - changes out to another repository that people - can pull from. - - - In practice, putting a centralised bottleneck like this in - place is not often a good idea, and transaction visibility has - nothing to do with the problem. As the size of a - project&emdash;and the time it takes to build and - test&emdash;grows, you rapidly run into a wall with this - try before you buy approach, where you have - more changesets to test than time in which to deal with them. - The inevitable result is frustration on the part of all - involved. - - - An approach that scales better is to get people to build - and test before they push, then run automated builds and tests - centrally after a push, to be sure all is - well. The advantage of this approach is that it does not - impose a limit on the rate at which the repository can accept - changes. - - - - - - A short tutorial on using hooks - - It is easy to write a Mercurial hook. Let's start with a - hook that runs when you finish a hg - commit, and simply prints the hash of the changeset - you just created. The hook is called commit. - - - All hooks follow the pattern in this example. - -&interaction.hook.simple.init; - - You add an entry to the hooks section of your ~/.hgrc. On the left is the name of - the event to trigger on; on the right is the action to take. 
As - you can see, you can run an arbitrary shell command in a hook. - Mercurial passes extra information to the hook using environment - variables (look for HG_NODE in the example). - - - - Performing multiple actions per event - - Quite often, you will want to define more than one hook - for a particular kind of event, as shown below. - -&interaction.hook.simple.ext; - - Mercurial lets you do this by adding an - extension to the end of a hook's name. - You extend a hook's name by giving the name of the hook, - followed by a full stop (the - . character), followed by - some more text of your choosing. For example, Mercurial will - run both commit.foo and - commit.bar when the - commit event occurs. - - - To give a well-defined order of execution when there are - multiple hooks defined for an event, Mercurial sorts hooks by - extension, and executes the hook commands in this sorted - order. In the above example, it will execute - commit.bar before - commit.foo, and commit - before both. - - - It is a good idea to use a somewhat descriptive - extension when you define a new hook. This will help you to - remember what the hook was for. If the hook fails, you'll get - an error message that contains the hook name and extension, so - using a descriptive extension could give you an immediate hint - as to why the hook failed (see for an example). - - - - - Controlling whether an activity can proceed - - In our earlier examples, we used the commit hook, which is run after a - commit has completed. This is one of several Mercurial hooks - that run after an activity finishes. Such hooks have no way - of influencing the activity itself. - - - Mercurial defines a number of events that occur before an - activity starts; or after it starts, but before it finishes. - Hooks that trigger on these events have the added ability to - choose whether the activity can continue, or will abort. - - - The pretxncommit hook runs - after a commit has all but completed. 
In other words, the - metadata representing the changeset has been written out to - disk, but the transaction has not yet been allowed to - complete. The pretxncommit - hook has the ability to decide whether the transaction can - complete, or must be rolled back. - - - If the pretxncommit hook - exits with a status code of zero, the transaction is allowed - to complete; the commit finishes; and the commit hook is run. If the pretxncommit hook exits with a - non-zero status code, the transaction is rolled back; the - metadata representing the changeset is erased; and the - commit hook is not run. - - -&interaction.hook.simple.pretxncommit; - - The hook in the example above checks that a commit comment - contains a bug ID. If it does, the commit can complete. If - not, the commit is rolled back. - - - - - - Writing your own hooks - - When you are writing a hook, you might find it useful to run - Mercurial either with the option, or the verbose config item set to - true. When you do so, Mercurial will print a - message before it calls each hook. - - - - Choosing how your hook should run - - You can write a hook either as a normal - program&emdash;typically a shell script&emdash;or as a Python - function that is executed within the Mercurial process. - - - Writing a hook as an external program has the advantage - that it requires no knowledge of Mercurial's internals. You - can call normal Mercurial commands to get any added - information you need. The trade-off is that external hooks - are slower than in-process hooks. - - - An in-process Python hook has complete access to the - Mercurial API, and does not shell out to - another process, so it is inherently faster than an external - hook. It is also easier to obtain much of the information - that a hook requires by using the Mercurial API than by - running Mercurial commands. - - - If you are comfortable with Python, or require high - performance, writing your hooks in Python may be a good - choice. 
However, when you have a straightforward hook to - write and you don't need to care about performance (probably - the majority of hooks), a shell script is perfectly fine. - - - - - Hook parameters - - Mercurial calls each hook with a set of well-defined - parameters. In Python, a parameter is passed as a keyword - argument to your hook function. For an external program, a - parameter is passed as an environment variable. - - - Whether your hook is written in Python or as a shell - script, the hook-specific parameter names and values will be - the same. A boolean parameter will be represented as a - boolean value in Python, but as the number 1 (for - true) or 0 (for false) as an - environment variable for an external hook. If a hook - parameter is named foo, the keyword - argument for a Python hook will also be named - foo, while the environment variable for an - external hook will be named HG_FOO. - - - - - Hook return values and activity control - - A hook that executes successfully must exit with a status - of zero if external, or return boolean false if - in-process. Failure is indicated with a non-zero exit status - from an external hook, or an in-process hook returning boolean - true. If an in-process hook raises an - exception, the hook is considered to have failed. - - - For a hook that controls whether an activity can proceed, - zero/false means allow, while - non-zero/true/exception means deny. - - - - - Writing an external hook - - When you define an external hook in your ~/.hgrc and the hook is run, its - value is passed to your shell, which interprets it. This - means that you can use normal shell constructs in the body of - the hook. - - - An executable hook is always run with its current - directory set to a repository's root directory. - - - Each hook parameter is passed in as an environment - variable; the name is upper-cased, and prefixed with the - string HG_. 
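The naming rule for external hooks can be sketched in a few lines of Python. This is only an illustration of the convention described above, not Mercurial's own code; the helper name hook_env is made up for the example.

```python
def hook_env(params):
    """Map hook parameters to the environment variables an external hook sees.

    Illustrative only: hook_env is a hypothetical helper, not a Mercurial API.
    """
    env = {}
    for name, value in params.items():
        if isinstance(value, bool):
            # Booleans are passed as the number 1 (true) or 0 (false).
            value = "1" if value else "0"
        # The parameter name is upper-cased and prefixed with HG_.
        env["HG_" + name.upper()] = str(value)
    return env


print(hook_env({"node": "aad8b264143a", "foo": True}))
```

A Python hook would receive the same two parameters as the keyword arguments node and foo, with foo as a real boolean.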
- - - With the exception of hook parameters, Mercurial does not - set or modify any environment variables when running a hook. - This is useful to remember if you are writing a site-wide hook - that may be run by a number of different users with differing - environment variables set. In multi-user situations, you - should not rely on environment variables being set to the - values you have in your environment when testing the hook. - - - - - Telling Mercurial to use an in-process hook - - The ~/.hgrc syntax - for defining an in-process hook is slightly different than for - an executable hook. The value of the hook must start with the - text python:, and continue - with the fully-qualified name of a callable object to use as - the hook's value. - - - The module in which a hook lives is automatically imported - when a hook is run. So long as you have the module name and - PYTHONPATH right, it should just - work. - - - The following ~/.hgrc - example snippet illustrates the syntax and meaning of the - notions we just described. - - [hooks] -commit.example = python:mymodule.submodule.myhook - When Mercurial runs the commit.example - hook, it imports mymodule.submodule, looks - for the callable object named myhook, and - calls it. - - - - - Writing an in-process hook - - The simplest in-process hook does nothing, but illustrates - the basic shape of the hook API: - - def myhook(ui, repo, **kwargs): - pass - The first argument to a Python hook is always a ui object. The second - is a repository object; at the moment, it is always an - instance of localrepository. - Following these two arguments are other keyword arguments. - Which ones are passed in depends on the hook being called, but - a hook can ignore arguments it doesn't care about by dropping - them into a keyword argument dict, as with - **kwargs above. - - - - - - Some hook examples - - - Writing meaningful commit messages - - It's hard to imagine a useful commit message being very - short. 
The simple pretxncommit - hook of the example below will prevent you from committing a - changeset with a message that is less than ten bytes long. - - -&interaction.hook.msglen.go; - - - - Checking for trailing whitespace - - An interesting use of a commit-related hook is to help you - to write cleaner code. A simple example of cleaner - code is the dictum that a change should not add any - new lines of text that contain trailing - whitespace. Trailing whitespace is a series of - space and tab characters at the end of a line of text. In - most cases, trailing whitespace is unnecessary, invisible - noise, but it is occasionally problematic, and people often - prefer to get rid of it. - - - You can use either the precommit or pretxncommit hook to tell whether you - have a trailing whitespace problem. If you use the precommit hook, the hook will not know - which files you are committing, so it will have to check every - modified file in the repository for trailing white space. If - you want to commit a change to just the file - foo, but the file - bar contains trailing whitespace, doing a - check in the precommit hook - will prevent you from committing foo due - to the problem with bar. This doesn't - seem right. - - - Should you choose the pretxncommit hook, the check won't - occur until just before the transaction for the commit - completes. This will allow you to check for problems only the - exact files that are being committed. However, if you entered - the commit message interactively and the hook fails, the - transaction will roll back; you'll have to re-enter the commit - message after you fix the trailing whitespace and run hg commit again. - - -&interaction.hook.ws.simple; - - In this example, we introduce a simple pretxncommit hook that checks for - trailing whitespace. This hook is short, but not very - helpful. 
It exits with an error status if a change adds a - line with trailing whitespace to any file, but does not print - any information that might help us to identify the offending - file or line. It also has the nice property of not paying - attention to unmodified lines; only lines that introduce new - trailing whitespace cause problems. - - - The above version is much more complex, but also more - useful. It parses a unified diff to see if any lines add - trailing whitespace, and prints the name of the file and the - line number of each such occurrence. Even better, if the - change adds trailing whitespace, this hook saves the commit - comment and prints the name of the save file before exiting - and telling Mercurial to roll the transaction back, so you can - use the - option to hg commit to reuse - the saved commit message once you've corrected the problem. - - -&interaction.hook.ws.better; - - As a final aside, note in the example above the use of - perl's in-place editing feature to get rid - of trailing whitespace from a file. This is concise and - useful enough that I will reproduce it here. - - perl -pi -e 's,\s+$,,' filename - - - - - Bundled hooks - - Mercurial ships with several bundled hooks. You can find - them in the hgext - directory of a Mercurial source tree. If you are using a - Mercurial binary package, the hooks will be located in the - hgext directory of - wherever your package installer put Mercurial. - - - - <literal role="hg-ext">acl</literal>&emdash;access - control for parts of a repository - - The acl extension lets - you control which remote users are allowed to push changesets - to a networked server. You can protect any portion of a - repository (including the entire repo), so that a specific - remote user can push changes that do not affect the protected - portion. - - - This extension implements access control based on the - identity of the user performing a push, - not on who committed the changesets - they're pushing. 
It makes sense to use this hook only if you - have a locked-down server environment that authenticates - remote users, and you want to be sure that only specific users - are allowed to push changes to that server. - - - - Configuring the <literal role="hook">acl</literal> - hook - - In order to manage incoming changesets, the acl hook must be used as a - pretxnchangegroup hook. This - lets it see which files are modified by each incoming - changeset, and roll back a group of changesets if they - modify forbidden files. Example: - - [hooks] -pretxnchangegroup.acl = python:hgext.acl.hook - - The acl extension is - configured using three sections. - - - The acl section has - only one entry, sources, - which lists the sources of incoming changesets that the hook - should pay attention to. You don't normally need to - configure this section. - - - serve: - Control incoming changesets that are arriving from a - remote repository over http or ssh. This is the default - value of sources, and - usually the only setting you'll need for this - configuration item. - - - pull: - Control incoming changesets that are arriving via a pull - from a local repository. - - - push: - Control incoming changesets that are arriving via a push - from a local repository. - - - bundle: - Control incoming changesets that are arriving from - another repository via a bundle. - - - - The acl.allow - section controls the users that are allowed to add - changesets to the repository. If this section is not - present, all users that are not explicitly denied are - allowed. If this section is present, all users that are not - explicitly allowed are denied (so an empty section means - that all users are denied). - - - The acl.deny - section determines which users are denied from adding - changesets to the repository. If this section is not - present or is empty, no users are denied. - - - The syntaxes for the acl.allow and acl.deny sections are - identical. 
On the left of each entry is a glob pattern that - matches files or directories, relative to the root of the - repository; on the right, a user name. - - - In the following example, the user - docwriter can only push changes to the - docs subtree of the - repository, while intern can push changes - to any file or directory except source/sensitive. - - [acl.allow] -docs/** = docwriter -[acl.deny] -source/sensitive/** = intern - - - - Testing and troubleshooting - - If you want to test the acl hook, run it with Mercurial's - debugging output enabled. Since you'll probably be running - it on a server where it's not convenient (or sometimes - possible) to pass in the option, don't forget - that you can enable debugging output in your ~/.hgrc: - - [ui] -debug = true - With this enabled, the acl hook will print enough - information to let you figure out why it is allowing or - forbidding pushes from specific users. - - - - - - <literal - role="hg-ext">bugzilla</literal>&emdash;integration with - Bugzilla - - The bugzilla extension - adds a comment to a Bugzilla bug whenever it finds a reference - to that bug ID in a commit comment. You can install this hook - on a shared server, so that any time a remote user pushes - changes to this server, the hook gets run. - - - It adds a comment to the bug that looks like this (you can - configure the contents of the comment&emdash;see below): - - Changeset aad8b264143a, made by Joe User - <joe.user@domain.com> in the frobnitz repository, refers - to this bug. For complete details, see - http://hg.domain.com/frobnitz?cmd=changeset;node=aad8b264143a - Changeset description: Fix bug 10483 by guarding against some - NULL pointers - The value of this hook is that it automates the process of - updating a bug any time a changeset refers to it. If you - configure the hook properly, it makes it easy for people to - browse straight from a Bugzilla bug to a changeset that refers - to that bug. 
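To make the idea of finding a bug reference in a commit comment concrete, here is a minimal sketch. The regular expression and the function name find_bug_ids are assumptions chosen for illustration; they are not taken from the bugzilla extension, which may recognise references differently.

```python
import re

# Assumed pattern: the word "bug", an optional "#", then a run of digits.
BUG_RE = re.compile(r"\bbug\s*#?\s*(\d+)", re.IGNORECASE)


def find_bug_ids(description):
    """Return the bug IDs mentioned in a commit comment, in order of appearance."""
    return [int(match.group(1)) for match in BUG_RE.finditer(description)]


print(find_bug_ids("Fix bug 10483 by guarding against some NULL pointers"))
```

The word boundary at the start of the pattern keeps words such as "debug" from being mistaken for a bug reference.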
- - - You can use the code in this hook as a starting point for - some more exotic Bugzilla integration recipes. Here are a few - possibilities: - - - Require that every changeset pushed to the - server have a valid bug ID in its commit comment. In this - case, you'd want to configure the hook as a pretxncommit hook. This would - allow the hook to reject changes that didn't contain bug - IDs. - - - Allow incoming changesets to automatically - modify the state of a bug, as well as - simply adding a comment. For example, the hook could - recognise the string fixed bug 31337 as - indicating that it should update the state of bug 31337 to - requires testing. - - - - - Configuring the <literal role="hook">bugzilla</literal> - hook - - You should configure this hook in your server's - ~/.hgrc as an incoming hook, for example as - follows: - - [hooks] -incoming.bugzilla = python:hgext.bugzilla.hook - - Because of the specialised nature of this hook, and - because Bugzilla was not written with this kind of - integration in mind, configuring this hook is a somewhat - involved process. - - - Before you begin, you must install the MySQL bindings - for Python on the host(s) where you'll be running the hook. - If this is not available as a binary package for your - system, you can download it from - web:mysql-python. - - - Configuration information for this hook lives in the - bugzilla section of - your ~/.hgrc. - - - version: The version - of Bugzilla installed on the server. The database - schema that Bugzilla uses changes occasionally, so this - hook has to know exactly which schema to use. At the - moment, the only version supported is - 2.16. - - - host: - The hostname of the MySQL server that stores your - Bugzilla data. The database must be configured to allow - connections from whatever host you are running the - bugzilla hook on. - - - user: - The username with which to connect to the MySQL server. 
- The database must be configured to allow this user to - connect from whatever host you are running the bugzilla hook on. This user - must be able to access and modify Bugzilla tables. The - default value of this item is bugs, - which is the standard name of the Bugzilla user in a - MySQL database. - - - password: The MySQL - password for the user you configured above. This is - stored as plain text, so you should make sure that - unauthorised users cannot read the ~/.hgrc file where you - store this information. - - - db: - The name of the Bugzilla database on the MySQL server. - The default value of this item is - bugs, which is the standard name of - the MySQL database where Bugzilla stores its data. - - - notify: If you want - Bugzilla to send out a notification email to subscribers - after this hook has added a comment to a bug, you will - need this hook to run a command whenever it updates the - database. The command to run depends on where you have - installed Bugzilla, but it will typically look something - like this, if you have Bugzilla installed in /var/www/html/bugzilla: - - cd /var/www/html/bugzilla && - ./processmail %s nobody@nowhere.com - - The Bugzilla - processmail program expects to be - given a bug ID (the hook replaces - %s with the bug ID) - and an email address. It also expects to be able to - write to some files in the directory that it runs in. - If Bugzilla and this hook are not installed on the same - machine, you will need to find a way to run - processmail on the server where - Bugzilla is installed. - - - - - - Mapping committer names to Bugzilla user names - - By default, the bugzilla hook tries to use the - email address of a changeset's committer as the Bugzilla - user name with which to update a bug. If this does not suit - your needs, you can map committer email addresses to - Bugzilla user names using a usermap section. 
- - - Each item in the usermap section contains an - email address on the left, and a Bugzilla user name on the - right. - - [usermap] -jane.user@example.com = jane - You can either keep the usermap data in a normal - ~/.hgrc, or tell the - bugzilla hook to read the - information from an external usermap - file. In the latter case, you can store - usermap data by itself in (for example) - a user-modifiable repository. This makes it possible to let - your users maintain their own usermap entries. The main - ~/.hgrc file might look - like this: - - # regular hgrc file refers to external usermap file -[bugzilla] -usermap = /home/hg/repos/userdata/bugzilla-usermap.conf - While the usermap file that it - refers to might look like this: - - # bugzilla-usermap.conf - inside a hg repository -[usermap] stephanie@example.com = steph - - - - Configuring the text that gets added to a bug - - You can configure the text that this hook adds as a - comment; you specify it in the form of a Mercurial template. - Several ~/.hgrc entries - (still in the bugzilla - section) control this behavior. - - - strip: The number of - leading path elements to strip from a repository's path - name to construct a partial path for a URL. For example, - if the repositories on your server live under /home/hg/repos, and you - have a repository whose path is /home/hg/repos/app/tests, - then setting strip to - 4 will give a partial path of - app/tests. The - hook will make this partial path available when - expanding a template, as webroot. - - - template: The text of the - template to use. In addition to the usual - changeset-related variables, this template can use - hgweb (the value of the - hgweb configuration item above) and - webroot (the path constructed using - strip above). - - - - In addition, you can add a baseurl item to the web section of your ~/.hgrc. 
The bugzilla hook will make this - available when expanding a template, as the base string to - use when constructing a URL that will let users browse from - a Bugzilla comment to view a changeset. Example: - - [web] -baseurl = http://hg.domain.com/ - - Here is an example set of bugzilla hook config information. - - - &ch10-bugzilla-config.lst; - - - - Testing and troubleshooting - - The most common problems with configuring the bugzilla hook relate to running - Bugzilla's processmail script and - mapping committer names to user names. - - - Recall from above that the user - that runs the Mercurial process on the server is also the - one that will run the processmail - script. The processmail script - sometimes causes Bugzilla to write to files in its - configuration directory, and Bugzilla's configuration files - are usually owned by the user that your web server runs - under. - - - You can cause processmail to be run - with the suitable user's identity using the - sudo command. Here is an example entry - for a sudoers file. - - hg_user = (httpd_user) -NOPASSWD: /var/www/html/bugzilla/processmail-wrapper %s - This allows the hg_user user to run a - processmail-wrapper program under the - identity of httpd_user. - - - This indirection through a wrapper script is necessary, - because processmail expects to be run - with its current directory set to wherever you installed - Bugzilla; you can't specify that kind of constraint in a - sudoers file. The contents of the - wrapper script are simple: - - #!/bin/sh -cd `dirname $0` && ./processmail "$1" nobody@example.com - It doesn't seem to matter what email address you pass to - processmail. - - - If your usermap is - not set up correctly, users will see an error message from - the bugzilla hook when they - push changes to the server. 
The error message will look - like this: - - cannot find bugzilla user id for john.q.public@example.com - What this means is that the committer's address, - john.q.public@example.com, is not a valid - Bugzilla user name, nor does it have an entry in your - usermap that maps it to - a valid Bugzilla user name. - - - - - - <literal role="hg-ext">notify</literal>&emdash;send email - notifications - - Although Mercurial's built-in web server provides RSS - feeds of changes in every repository, many people prefer to - receive change notifications via email. The notify hook lets you send out - notifications to a set of email addresses whenever changesets - arrive that those subscribers are interested in. - - - As with the bugzilla - hook, the notify hook is - template-driven, so you can customise the contents of the - notification messages that it sends. - - - By default, the notify - hook includes a diff of every changeset that it sends out; you - can limit the size of the diff, or turn this feature off - entirely. It is useful for letting subscribers review changes - immediately, rather than clicking to follow a URL. - - - - Configuring the <literal role="hg-ext">notify</literal> - hook - - You can set up the notify hook to send one email - message per incoming changeset, or one per incoming group of - changesets (all those that arrived in a single pull or - push). - - [hooks] -# send one email per group of changes -changegroup.notify = python:hgext.notify.hook -# send one email per change -incoming.notify = python:hgext.notify.hook - - Configuration information for this hook lives in the - notify section of a - ~/.hgrc file. - - - test: - By default, this hook does not send out email at all; - instead, it prints the message that it - would send. Set this item to - false to allow email to be sent. 
The - reason that sending of email is turned off by default is - that it takes several tries to configure this extension - exactly as you would like, and it would be bad form to - spam subscribers with a number of broken - notifications while you debug your configuration. - - - config: - The path to a configuration file that contains - subscription information. This is kept separate from - the main ~/.hgrc so - that you can maintain it in a repository of its own. - People can then clone that repository, update their - subscriptions, and push the changes back to your server. - - - strip: - The number of leading path separator characters to strip - from a repository's path, when deciding whether a - repository has subscribers. For example, if the - repositories on your server live in /home/hg/repos, and - notify is considering a - repository named /home/hg/repos/shared/test, - setting strip to - 4 will cause notify to trim the path it - considers down to shared/test, and it will - match subscribers against that. - - - template: The template - text to use when sending messages. This specifies both - the contents of the message header and its body. - - - maxdiff: The maximum - number of lines of diff data to append to the end of a - message. If a diff is longer than this, it is - truncated. By default, this is set to 300. Set this to - 0 to omit diffs from notification - emails. - - - sources: A list of - sources of changesets to consider. This lets you limit - notify to only sending - out email about changes that remote users pushed into - this repository via a server, for example. See - for the sources you - can specify here. - - - - If you set the baseurl - item in the web section, - you can use it in a template; it will be available as - webroot. - - - Here is an example set of notify configuration information. 
- - - &ch10-notify-config.lst; - - This will produce a message that looks like the - following: - - - &ch10-notify-config-mail.lst; - - - - Testing and troubleshooting - - Do not forget that by default, the notify extension will not - send any mail until you explicitly configure it to do so, - by setting test to - false. Until you do that, it simply - prints the message it would send. - - - - - - - Information for writers of hooks - - - In-process hook execution - - An in-process hook is called with arguments of the - following form: - - def myhook(ui, repo, **kwargs): pass - The ui parameter is a ui object. The - repo parameter is a localrepository - object. The names and values of the - **kwargs parameters depend on the hook - being invoked, with the following common features: - - - If a parameter is named - node or parentN, it - will contain a hexadecimal changeset ID. The empty string - is used to represent null changeset ID - instead of a string of zeroes. - - - If a parameter is named - url, it will contain the URL of a - remote repository, if that can be determined. - - - Boolean-valued parameters are represented as - Python bool objects. - - - - An in-process hook is called without a change to the - process's working directory (unlike external hooks, which are - run in the root of the repository). It must not change the - process's working directory, or it will cause any calls it - makes into the Mercurial API to fail. - - - If a hook returns a boolean false value, it - is considered to have succeeded. If it returns a boolean - true value or raises an exception, it is - considered to have failed. A useful way to think of the - calling convention is tell me if you fail. - - - Note that changeset IDs are passed into Python hooks as - hexadecimal strings, not the binary hashes that Mercurial's - APIs normally use. To convert a hash from hex to binary, use - the bin function. 
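The calling convention above can be sketched as a small in-process hook. The hook name check_node is hypothetical. The conversion uses the standard library's binascii.unhexlify as a stand-in for the bin function mentioned above, and the messages are plain strings for brevity (recent Mercurial versions may expect byte strings instead).

```python
import binascii


def check_node(ui, repo, node=None, **kwargs):
    """Hypothetical hook: succeed only if a well-formed changeset ID was passed."""
    if not node:
        # The empty string stands for the null changeset ID.
        ui.warn("no changeset ID passed to hook\n")
        return True  # a true value tells Mercurial the hook failed
    try:
        binary = binascii.unhexlify(node)  # hexadecimal string -> binary hash
    except (binascii.Error, ValueError):
        ui.warn("changeset ID is not valid hexadecimal\n")
        return True
    if len(binary) != 20:
        ui.warn("changeset ID has an unexpected length\n")
        return True
    return False  # a false value tells Mercurial the hook succeeded
```

Unused parameters fall into kwargs, so the same function could be attached to any hook that passes a node parameter.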
- - - - External hook execution - - An external hook is passed to the shell of the user - running Mercurial. Features of that shell, such as variable - substitution and command redirection, are available. The hook - is run in the root directory of the repository (unlike - in-process hooks, which are run in the same directory that - Mercurial was run in). - - - Hook parameters are passed to the hook as environment - variables. Each environment variable's name is converted to - upper case and prefixed with the string - HG_. For example, if the - name of a parameter is node, - the name of the environment variable representing that - parameter will be HG_NODE. - - - A boolean parameter is represented as the string - 1 for true, - 0 for false. - If an environment variable is named HG_NODE, - HG_PARENT1 or HG_PARENT2, it - contains a changeset ID represented as a hexadecimal string. - The empty string is used to represent a null changeset - ID, instead of a string of zeroes. If an environment - variable is named HG_URL, it will contain the - URL of a remote repository, if that can be determined. - - - If a hook exits with a status of zero, it is considered to - have succeeded. If it exits with a non-zero status, it is - considered to have failed. - - - - - Finding out where changesets come from - - A hook that involves the transfer of changesets between a - local repository and another may be able to find out - information about the far side. Mercurial - knows how changes are being transferred, - and in many cases where they are being - transferred to or from. - - - - Sources of changesets - - Mercurial will tell a hook what means are, or were, used - to transfer changesets between repositories. This is - provided by Mercurial in a Python parameter named - source, or an environment variable named - HG_SOURCE. - - - - serve: Changesets are - transferred to or from a remote repository over http or - ssh.
- - - pull: Changesets are - being transferred via a pull from one repository into - another. - - - push: Changesets are - being transferred via a push from one repository into - another. - - - bundle: Changesets are - being transferred to or from a bundle. - - - - - - Where changes are going&emdash;remote repository - URLs - - When possible, Mercurial will tell a hook the location - of the far side of an activity that transfers - changeset data between repositories. This is provided by - Mercurial in a Python parameter named - url, or an environment variable named - HG_URL. - - - This information is not always known. If a hook is - invoked in a repository that is being served via http or - ssh, Mercurial cannot tell where the remote repository is, - but it may know where the client is connecting from. In - such cases, the URL will take one of the following forms: - - - remote:ssh:1.2.3.4&emdash;remote - ssh client, at the IP address - 1.2.3.4. - - - remote:http:1.2.3.4&emdash;remote - http client, at the IP address - 1.2.3.4. If the client is using SSL, - this will be of the form - remote:https:1.2.3.4. - - - Empty&emdash;no information could be - discovered about the remote client. - - - - - - - - Hook reference - - - <literal role="hook">changegroup</literal>&emdash;after - remote changesets added - - This hook is run after a group of pre-existing changesets - has been added to the repository, for example via a hg pull or hg - unbundle. This hook is run once per operation - that added one or more changesets. This is in contrast to the - incoming hook, which is run - once per changeset, regardless of whether the changesets - arrive in a group. - - - Some possible uses for this hook include kicking off an - automated build or test of the added changesets, updating a - bug database, or notifying subscribers that a repository - contains new changes. - - - Parameters to this hook: - - - node: A changeset ID. 
The - changeset ID of the first changeset in the group that was - added. All changesets between this and - tip, inclusive, were added by a single - hg pull, hg push or hg unbundle. - - - source: A - string. The source of these changes. See for details. - - - url: A URL. The - location of the remote repository, if known. See for more information. - - - - See also: incoming (), prechangegroup (), pretxnchangegroup () - - - - - <literal role="hook">commit</literal>&emdash;after a new - changeset is created - - This hook is run after a new changeset has been created. - - - Parameters to this hook: - - - node: A changeset ID. The - changeset ID of the newly committed changeset. - - - parent1: A changeset ID. - The changeset ID of the first parent of the newly - committed changeset. - - - parent2: A changeset ID. - The changeset ID of the second parent of the newly - committed changeset. - - - - See also: precommit (), pretxncommit () - - - - - <literal role="hook">incoming</literal>&emdash;after one - remote changeset is added - - This hook is run after a pre-existing changeset has been - added to the repository, for example via a hg push. If a group of changesets - was added in a single operation, this hook is called once for - each added changeset. - - - You can use this hook for the same purposes as - the changegroup hook (); it's simply more - convenient sometimes to run a hook once per group of - changesets, while other times it's handier once per changeset. - - - Parameters to this hook: - - - node: A changeset ID. The - ID of the newly added changeset. - - - source: A - string. The source of these changes. See for details. - - - url: A URL. The - location of the remote repository, if known. See for more information. 
- - - - See also: changegroup () prechangegroup (), pretxnchangegroup () - - - - - <literal role="hook">outgoing</literal>&emdash;after - changesets are propagated - - This hook is run after a group of changesets has been - propagated out of this repository, for example by a hg push or hg - bundle command. - - - One possible use for this hook is to notify administrators - that changes have been pulled. - - - Parameters to this hook: - - - node: A changeset ID. The - changeset ID of the first changeset of the group that was - sent. - - - source: A string. The - source of the of the operation (see ). If a remote - client pulled changes from this repository, - source will be - serve. If the client that obtained - changes from this repository was local, - source will be - bundle, pull, or - push, depending on the operation the - client performed. - - - url: A URL. The - location of the remote repository, if known. See for more information. - - - - See also: preoutgoing () - - - - - <literal - role="hook">prechangegroup</literal>&emdash;before starting - to add remote changesets - - This controlling hook is run before Mercurial begins to - add a group of changesets from another repository. - - - This hook does not have any information about the - changesets to be added, because it is run before transmission - of those changesets is allowed to begin. If this hook fails, - the changesets will not be transmitted. - - - One use for this hook is to prevent external changes from - being added to a repository. For example, you could use this - to freeze a server-hosted branch temporarily or - permanently so that users cannot push to it, while still - allowing a local administrator to modify the repository. - - - Parameters to this hook: - - - source: A string. The - source of these changes. See for details. - - - url: A URL. The - location of the remote repository, if known. See for more information. 
- - - - See also: changegroup (), incoming (), pretxnchangegroup () - - - - - <literal role="hook">precommit</literal>&emdash;before - starting to commit a changeset - - This hook is run before Mercurial begins to commit a new - changeset. It is run before Mercurial has any of the metadata - for the commit, such as the files to be committed, the commit - message, or the commit date. - - - One use for this hook is to disable the ability to commit - new changesets, while still allowing incoming changesets. - Another is to run a build or test, and only allow the commit - to begin if the build or test succeeds. - - - Parameters to this hook: - - - parent1: A changeset ID. - The changeset ID of the first parent of the working - directory. - - - parent2: A changeset ID. - The changeset ID of the second parent of the working - directory. - - - If the commit proceeds, the parents of the working - directory will become the parents of the new changeset. - - - See also: commit - (), pretxncommit () - - - - - <literal role="hook">preoutgoing</literal>&emdash;before - starting to propagate changesets - - This hook is invoked before Mercurial knows the identities - of the changesets to be transmitted. - - - One use for this hook is to prevent changes from being - transmitted to another repository. - - - Parameters to this hook: - - - source: A - string. The source of the operation that is attempting to - obtain changes from this repository (see ). See the documentation - for the source parameter to the - outgoing hook, in - , for possible values - of this parameter. - - - url: A URL. The - location of the remote repository, if known. See for more information. - - - - See also: outgoing () - - - - - <literal role="hook">pretag</literal>&emdash;before - tagging a changeset - - This controlling hook is run before a tag is created. If - the hook succeeds, creation of the tag proceeds. If the hook - fails, the tag is not created. - - - Parameters to this hook: - - - local: A boolean. 
Whether - the tag is local to this repository instance (i.e. stored - in .hg/localtags) or - managed by Mercurial (stored in .hgtags). - - - node: A changeset ID. The - ID of the changeset to be tagged. - - - tag: A string. The name of - the tag to be created. - - - - If the tag to be created is - revision-controlled, the precommit and pretxncommit hooks ( and ) will also be run. - - - See also: tag - () - - - - <literal - role="hook">pretxnchangegroup</literal>&emdash;before - completing addition of remote changesets - - This controlling hook is run before a - transaction&emdash;that manages the addition of a group of new - changesets from outside the repository&emdash;completes. If - the hook succeeds, the transaction completes, and all of the - changesets become permanent within this repository. If the - hook fails, the transaction is rolled back, and the data for - the changesets is erased. - - - This hook can access the metadata associated with the - almost-added changesets, but it should not do anything - permanent with this data. It must also not modify the working - directory. - - - While this hook is running, if other Mercurial processes - access this repository, they will be able to see the - almost-added changesets as if they are permanent. This may - lead to race conditions if you do not take steps to avoid - them. - - - This hook can be used to automatically vet a group of - changesets. If the hook fails, all of the changesets are - rejected when the transaction rolls back. - - - Parameters to this hook: - - - node: A changeset ID. The - changeset ID of the first changeset in the group that was - added. All changesets between this and - tip, - inclusive, were added by a single hg pull, hg push or hg unbundle. - - - source: A - string. The source of these changes. See for details. - - - url: A URL. The - location of the remote repository, if known. See for more information. 
- - - - See also: changegroup (), incoming (), prechangegroup () - - - - - <literal role="hook">pretxncommit</literal>&emdash;before - completing commit of new changeset - - This controlling hook is run before a - transaction&emdash;that manages a new commit&emdash;completes. - If the hook succeeds, the transaction completes and the - changeset becomes permanent within this repository. If the - hook fails, the transaction is rolled back, and the commit - data is erased. - - - This hook can access the metadata associated with the - almost-new changeset, but it should not do anything permanent - with this data. It must also not modify the working - directory. - - - While this hook is running, if other Mercurial processes - access this repository, they will be able to see the - almost-new changeset as if it is permanent. This may lead to - race conditions if you do not take steps to avoid them. - - - Parameters to this hook: - - - node: A changeset ID. The - changeset ID of the newly committed changeset. - - - parent1: A changeset ID. - The changeset ID of the first parent of the newly - committed changeset. - - - parent2: A changeset ID. - The changeset ID of the second parent of the newly - committed changeset. - - - - See also: precommit () - - - - - <literal role="hook">preupdate</literal>&emdash;before - updating or merging working directory - - This controlling hook is run before an update - or merge of the working directory begins. It is run only if - Mercurial's normal pre-update checks determine that the update - or merge can proceed. If the hook succeeds, the update or - merge may proceed; if it fails, the update or merge does not - start. - - - Parameters to this hook: - - - parent1: A - changeset ID. The ID of the parent that the working - directory is to be updated to. If the working directory - is being merged, it will not change this parent. - - - parent2: A - changeset ID. Only set if the working directory is being - merged. 
The ID of the revision that the working directory - is being merged with. - - - - See also: update - () - - - - <literal role="hook">tag</literal>&emdash;after tagging a - changeset - - This hook is run after a tag has been created. - - - Parameters to this hook: - - - local: A boolean. Whether - the new tag is local to this repository instance (i.e. - stored in .hg/localtags) or managed by - Mercurial (stored in .hgtags). - - - node: A changeset ID. The - ID of the changeset that was tagged. - - - tag: A string. The name of - the tag that was created. - - - - If the created tag is revision-controlled, the commit hook (section ) is run before this hook. - - - See also: pretag - () - - - - - <literal role="hook">update</literal>&emdash;after - updating or merging working directory - - This hook is run after an update or merge of the working - directory completes. Since a merge can fail (if the external - hgmerge command fails to resolve conflicts - in a file), this hook communicates whether the update or merge - completed cleanly. - - - - error: A boolean. - Indicates whether the update or merge completed - successfully. - - - parent1: A changeset ID. - The ID of the parent that the working directory was - updated to. If the working directory was merged, it will - not have changed this parent. - - - parent2: A changeset ID. - Only set if the working directory was merged. The ID of - the revision that the working directory was merged with. - - - - See also: preupdate - () - - - - - - - diff -r 5276f40fca1c -r 896ab6eaf1c6 en/ch09-undo.xml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/en/ch09-undo.xml Fri Jul 10 02:32:17 2009 +0900 @@ -0,0 +1,1201 @@ + + + + + Finding and fixing mistakes + + To err might be human, but to really handle the consequences + well takes a top-notch revision control system. In this chapter, + we'll discuss some of the techniques you can use when you find + that a problem has crept into your project. 
Mercurial has some + highly capable features that will help you to isolate the sources + of problems, and to handle them appropriately. + + + Erasing local history + + + The accidental commit + + I have the occasional but persistent problem of typing + rather more quickly than I can think, which sometimes results + in me committing a changeset that is either incomplete or + plain wrong. In my case, the usual kind of incomplete + changeset is one in which I've created a new source file, but + forgotten to hg add it. A + plain wrong changeset is not as common, but no + less annoying. + + + + Rolling back a transaction + + In , I + mentioned that Mercurial treats each modification of a + repository as a transaction. Every time + you commit a changeset or pull changes from another + repository, Mercurial remembers what you did. You can undo, + or roll back, exactly one of these + actions using the hg rollback + command. (See + for an important caveat about the use of this command.) + + Here's a mistake that I often find myself making: + committing a change in which I've created a new file, but + forgotten to hg add + it. + + &interaction.rollback.commit; + + Looking at the output of hg + status after the commit immediately confirms the + error. + + &interaction.rollback.status; + + The commit captured the changes to the file + a, but not the new file + b. If I were to push this changeset to a + repository that I shared with a colleague, the chances are + high that something in a would refer to + b, which would not be present in their + repository when they pulled my changes. I would thus become + the object of some indignation. + + However, luck is with me&emdash;I've caught my error + before I pushed the changeset. I use the hg rollback command, and Mercurial + makes that last changeset vanish. 
+ + &interaction.rollback.rollback; + + Notice that the changeset is no longer present in the + repository's history, and the working directory once again + thinks that the file a is modified. The + commit and rollback have left the working directory exactly as + it was prior to the commit; the changeset has been completely + erased. I can now safely hg + add the file b, and rerun my + commit. + + &interaction.rollback.add; + + + + The erroneous pull + + It's common practice with Mercurial to maintain separate + development branches of a project in different repositories. + Your development team might have one shared repository for + your project's 0.9 release, and another, + containing different changes, for the 1.0 + release. + + Given this, you can imagine that the consequences could be + messy if you had a local 0.9 repository, and + accidentally pulled changes from the shared 1.0 + repository into it. At worst, you could be paying + insufficient attention, and push those changes into the shared + 0.9 tree, confusing your entire team (but don't + worry, we'll return to this horror scenario later). However, + it's more likely that you'll notice immediately, because + Mercurial will display the URL it's pulling from, or you will + see it pull a suspiciously large number of changes into the + repository. + + The hg rollback command + will work nicely to expunge all of the changesets that you + just pulled. Mercurial groups all changes from one hg pull into a single transaction, + so one hg rollback is all you + need to undo this mistake. + + + + Rolling back is useless once you've pushed + + The value of the hg + rollback command drops to zero once you've pushed + your changes to another repository. Rolling back a change + makes it disappear entirely, but only in + the repository in which you perform the hg rollback. Because a rollback + eliminates history, there's no way for the disappearance of a + change to propagate between repositories. 
+ + If you've pushed a change to another + repository&emdash;particularly if it's a shared + repository&emdash;it has essentially escaped into the + wild, and you'll have to recover from your mistake + in a different way. If you push a changeset somewhere, then + roll it back, then pull from the repository you pushed to, the + changeset you thought you'd gotten rid of will simply reappear + in your repository. + + (If you absolutely know for sure that the change + you want to roll back is the most recent change in the + repository that you pushed to, and you + know that nobody else could have pulled it from that + repository, you can roll back the changeset there, too, but + you really should not expect this to work reliably. Sooner or + later a change really will make it into a repository that you + don't directly control (or have forgotten about), and come + back to bite you.) + + + + You can only roll back once + + Mercurial stores exactly one transaction in its + transaction log; that transaction is the most recent one that + occurred in the repository. This means that you can only roll + back one transaction. If you expect to be able to roll back + one transaction, then its predecessor, this is not the + behavior you will get. + + &interaction.rollback.twice; + + Once you've rolled back one transaction in a repository, + you can't roll back again in that repository until you perform + another commit or pull. + + + + + Reverting the mistaken change + + If you make a modification to a file, and decide that you + really didn't want to change the file at all, and you haven't + yet committed your changes, the hg + revert command is the one you'll need. It looks at + the changeset that's the parent of the working directory, and + restores the contents of the file to their state as of that + changeset. (That's a long-winded way of saying that, in the + normal case, it undoes your modifications.) 
+ + Let's illustrate how the hg + revert command works with yet another small example. + We'll begin by modifying a file that Mercurial is already + tracking. + + &interaction.daily.revert.modify; + + If we don't + want that change, we can simply hg + revert the file. + + &interaction.daily.revert.unmodify; + + The hg revert command + provides us with an extra degree of safety by saving our + modified file with a .orig + extension. + + &interaction.daily.revert.status; + + + Be careful with <filename>.orig</filename> files + + It's extremely unlikely that you are either using + Mercurial to manage files with .orig + extensions or that you even care about the contents of such + files. Just in case, though, it's useful to remember that + hg revert will + unconditionally overwrite an existing file with a + .orig extension. For instance, if you + already have a file named foo.orig when + you revert foo, the contents of + foo.orig will be clobbered. + + + Here is a summary of the cases that the hg revert command can deal with. We + will describe each of these in more detail in the section that + follows. + + If you modify a file, it will restore the file + to its unmodified state. + + If you hg add a + file, it will undo the added state of the + file, but leave the file itself untouched. + + If you delete a file without telling Mercurial, + it will restore the file to its unmodified contents. + + If you use the hg + remove command to remove a file, it will undo + the removed state of the file, and restore + the file to its unmodified contents. + + + + File management errors + + The hg revert command is + useful for more than just modified files. It lets you reverse + the results of all of Mercurial's file management + commands&emdash;hg add, + hg remove, and so on. + + If you hg add a file, + then decide that in fact you don't want Mercurial to track it, + use hg revert to undo the + add. Don't worry; Mercurial will not modify the file in any + way. 
It will just unmark the file. + + &interaction.daily.revert.add; + + Similarly, if you ask Mercurial to hg remove a file, you can use + hg revert to restore it to + the contents it had as of the parent of the working directory. + &interaction.daily.revert.remove; This works just as + well for a file that you deleted by hand, without telling + Mercurial (recall that in Mercurial terminology, this kind of + file is called missing). + + &interaction.daily.revert.missing; + + If you revert a hg copy, + the copied-to file remains in your working directory + afterwards, untracked. Since a copy doesn't affect the + copied-from file in any way, Mercurial doesn't do anything + with the copied-from file. + + &interaction.daily.revert.copy; + + + + + Dealing with committed changes + + Consider a case where you have committed a change + a, and another change + b on top of it; you then realise that + change a was incorrect. Mercurial lets you + back out an entire changeset automatically, and + building blocks that let you reverse part of a changeset by + hand. + + Before you read this section, here's something to + keep in mind: the hg backout + command undoes the effect of a change by + adding to your repository's history, not by + modifying or erasing it. It's the right tool to use if you're + fixing bugs, but not if you're trying to undo some change that + has catastrophic consequences. To deal with those, see + . + + + Backing out a changeset + + The hg backout command + lets you undo the effects of an entire + changeset in an automated fashion. Because Mercurial's + history is immutable, this command does + not get rid of the changeset you want to undo. + Instead, it creates a new changeset that + reverses the effect of the to-be-undone + changeset. + + The operation of the hg + backout command is a little intricate, so let's + illustrate it with some examples. First, we'll create a + repository with some simple changes. 
+ + &interaction.backout.init; + + The hg backout command + takes a single changeset ID as its argument; this is the + changeset to back out. Normally, hg + backout will drop you into a text editor to write + a commit message, so you can record why you're backing the + change out. In this example, we provide a commit message on + the command line using the option. + + + + Backing out the tip changeset + + We're going to start by backing out the last changeset we + committed. + + &interaction.backout.simple; + + You can see that the second line from + myfile is no longer present. Taking a + look at the output of hg log + gives us an idea of what the hg + backout command has done. + &interaction.backout.simple.log; Notice that the new changeset + that hg backout has created + is a child of the changeset we backed out. It's easier to see + this in , which presents a + graphical view of the change history. As you can see, the + history is nice and linear. + +
+ Backing out a change using the <command + role="hg-cmd">hg backout</command> command + + + XXX add text + +
+ +
+ + Backing out a non-tip change + + If you want to back out a change other than the last one + you committed, pass the --merge option to the + hg backout command. + + &interaction.backout.non-tip.clone; + + This makes backing out any changeset a + one-shot operation that's usually simple and + fast. + + &interaction.backout.non-tip.backout; + + If you take a look at the contents of + myfile after the backout finishes, you'll + see that the first and third changes are present, but not the + second. + + &interaction.backout.non-tip.cat; + + As the graphical history in illustrates, Mercurial + still commits one change in this kind of situation (the + box-shaped node is the one that Mercurial commits + automatically), but the revision graph now looks different. + Before Mercurial begins the backout process, it first + remembers what the current parent of the working directory is. + It then backs out the target changeset, and commits that as a + changeset. Finally, it merges back to the previous parent of + the working directory, but notice that it does not + commit the result of the merge. The repository + now contains two heads, and the working directory is in a + merge state. + +
+ Automated backout of a non-tip change using the + <command role="hg-cmd">hg backout</command> command + + + XXX add text + +
+ + The result is that you end up back where you + were, only with some extra history that undoes the + effect of the changeset you wanted to back out. + + You might wonder why Mercurial does not commit the result + of the merge that it performed. The reason lies in Mercurial + behaving conservatively: a merge naturally has more scope for + error than simply undoing the effect of the tip changeset, + so your work will be safest if you first inspect (and test!) + the result of the merge, then commit + it. + + + Always use the <option + role="hg-opt-backout">--merge</option> option + + In fact, since the --merge option will do the + right thing whether or not the changeset + you're backing out is the tip (i.e. it won't try to merge if + it's backing out the tip, since there's no need), you should + always use this option when you run the + hg backout command. + + +
+ + Gaining more control of the backout process + + While I've recommended that you always use the --merge option when backing + out a change, the hg backout + command lets you decide how to merge a backout changeset. + Taking control of the backout process by hand is something you + will rarely need to do, but it can be useful to understand + what the hg backout command + is doing for you automatically. To illustrate this, let's + clone our first repository, but omit the backout change that + it contains. + + &interaction.backout.manual.clone; + + As with our + earlier example, we'll commit a third changeset, then back out + its parent, and see what happens. + + &interaction.backout.manual.backout; + + Our new changeset is again a descendant of the changeset + we backed out; it's thus a new head, not + a descendant of the changeset that was the tip. The hg backout command was quite + explicit in telling us this. + + &interaction.backout.manual.log; + + Again, it's easier to see what has happened by looking at + a graph of the revision history, in . This makes it clear + that when we use hg backout + to back out a change other than the tip, Mercurial adds a new + head to the repository (the change it committed is + box-shaped). + +
+ Backing out a change using the <command + role="hg-cmd">hg backout</command> command + + + XXX add text + +
+ + After the hg backout + command has completed, it leaves the new + backout changeset as the parent of the working + directory. + + &interaction.backout.manual.parents; + + Now we have two isolated sets of changes. + + &interaction.backout.manual.heads; + + Let's think about what we expect to see as the contents of + myfile now. The first change should be + present, because we've never backed it out. The second change + should be missing, as that's the change we backed out. Since + the history graph shows the third change as a separate head, + we don't expect to see the third change + present in myfile. + + &interaction.backout.manual.cat; + + To get the third change back into the file, we just do a + normal merge of our two heads. + + &interaction.backout.manual.merge; + + Afterwards, the graphical history of our + repository looks like + . + +
+ Manually merging a backout change + + + XXX add text + +
+ +
+ + Why <command role="hg-cmd">hg backout</command> works as + it does + + Here's a brief description of how the hg backout command works. + + It ensures that the working directory is + clean, i.e. that the output of hg status would be empty. + + It remembers the current parent of the working + directory. Let's call this changeset + orig. + + It does the equivalent of a hg update to sync the working + directory to the changeset you want to back out. Let's + call this changeset backout. + + It finds the parent of that changeset. Let's + call that changeset parent. + + For each file that the + backout changeset affected, it does the + equivalent of a hg revert -r + parent on that file, to restore it to the + contents it had before that changeset was + committed. + + It commits the result as a new changeset. + This changeset has backout as its + parent. + + If you specify on the command + line, it merges with orig, and commits + the result of the merge. + + + An alternative way to implement the hg backout command would be to + hg export the + to-be-backed-out changeset as a diff, then use the option to the + patch command to reverse the effect of the + change without fiddling with the working directory. This + sounds much simpler, but it would not work nearly as + well. + + The reason that hg + backout does an update, a commit, a merge, and + another commit is to give the merge machinery the best chance + to do a good job when dealing with all the changes + between the change you're backing out and + the current tip. + + If you're backing out a changeset that's 100 revisions + back in your project's history, the chances that the + patch command will be able to apply a + reverse diff cleanly are not good, because intervening changes + are likely to have broken the context that + patch uses to determine whether it can + apply a patch (if this sounds like gibberish, see for a + discussion of the patch command). 
Also, + Mercurial's merge machinery will handle files and directories + being renamed, permission changes, and modifications to binary + files, none of which patch can deal + with. + + +
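The steps listed above can be sketched as a short sequence of ordinary Mercurial commands. The Python snippet below is only an illustration under stated assumptions: the helper name `backout_plan` and its parameters are hypothetical, `hg revert --all` stands in for reverting each affected file individually, and the commit of the merge is left out so that the merge result can be inspected first. It builds the commands without running them.

```python
"""Sketch of the command sequence that `hg backout` is described as automating."""
import shlex


def backout_plan(backout, parent, orig, message, merge=False):
    """Return the hg commands, as argument lists, for a by-hand backout.

    `backout`, `parent` and `orig` are revision identifiers named after the
    changesets in the steps above. This helper is hypothetical: it only
    builds the commands and never runs them.
    """
    plan = [
        # The working directory must be clean: this should print nothing.
        ["hg", "status"],
        # Sync the working directory to the changeset being backed out.
        ["hg", "update", "-r", str(backout)],
        # Restore files to their contents as of that changeset's parent
        # (assumption: --all is used instead of naming each affected file).
        ["hg", "revert", "--all", "-r", str(parent)],
        # Commit the result; the new changeset is a child of `backout`.
        ["hg", "commit", "-m", message],
    ]
    if merge:
        # Merge with the changeset that was the working directory's parent.
        plan.append(["hg", "merge", "-r", str(orig)])
    return plan


if __name__ == "__main__":
    for command in backout_plan(1, 0, 2, "back out second change", merge=True):
        print(" ".join(shlex.quote(arg) for arg in command))
```

Running the printed commands one at a time in a scratch clone is a low-risk way to watch each step change the working directory and the history.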
+ + Changes that should never have been + + Most of the time, the hg + backout command is exactly what you need if you want + to undo the effects of a change. It leaves a permanent record + of exactly what you did, both when committing the original + changeset and when you cleaned up after it. + + On rare occasions, though, you may find that you've + committed a change that really should not be present in the + repository at all. For example, it would be very unusual, and + usually considered a mistake, to commit a software project's + object files as well as its source files. Object files have + almost no intrinsic value, and they're big, + so they increase the size of the repository and the amount of + time it takes to clone or pull changes. + + Before I discuss the options that you have if you commit a + brown paper bag change (the kind that's so bad + that you want to pull a brown paper bag over your head), let me + first discuss some approaches that probably won't work. + + Since Mercurial treats history as + accumulative&emdash;every change builds on top of all changes + that preceded it&emdash;you generally can't just make disastrous + changes disappear. The one exception is when you've just + committed a change, and it hasn't been pushed or pulled into + another repository. That's when you can safely use the hg rollback command, as I detailed in + . + + After you've pushed a bad change to another repository, you + could still use hg + rollback to make your local copy of the change + disappear, but it won't have the consequences you want. The + change will still be present in the remote repository, so it + will reappear in your local repository the next time you + pull. + + If a situation like this arises, and you know which + repositories your bad change has propagated into, you can + try to get rid of the change from + every one of those repositories. 
This is, + of course, not a satisfactory solution: if you miss even a + single repository while you're expunging, the change is still + in the wild, and could propagate further. + + If you've committed one or more changes + after the change that you'd like to see + disappear, your options are further reduced. Mercurial doesn't + provide a way to punch a hole in history, leaving + changesets intact. + + + Backing out a merge + + Since merges are often complicated, it is not unheard of + for a merge to be mangled badly, but committed erroneously. + Mercurial provides an important safeguard against bad merges + by refusing to commit unresolved files, but human ingenuity + guarantees that it is still possible to mess a merge up and + commit it. + + Given a bad merge that has been committed, usually the + best way to approach it is to simply try to repair the damage + by hand. A complete disaster that cannot be easily fixed up + by hand ought to be very rare, but the hg backout command may help in + making the cleanup easier. It offers a --parent option, which lets + you specify which parent to revert to when backing out a + merge. + +
+ A bad merge + + + XXX add text + +
+ + Suppose we have a revision graph like that in . What we'd like is to + redo the merge of revisions 2 and + 3. + + One way to do so would be as follows. + + + + Call hg backout --rev=4 + --parent=2. This tells hg backout to back out revision + 4, which is the bad merge, and, when deciding which + revision to prefer, to choose parent 2, one of the parents + of the merge. The effect can be seen in . +
+ Backing out the merge, favoring one parent + + + XXX add text + +
+
+ + + Call hg backout --rev=4 + --parent=3. This tells hg backout to back out revision + 4 again, but this time to choose parent 3, the other + parent of the merge. The result is visible in , in which the repository + now contains three heads. +
+ Backing out the merge, favoring the other + parent + + + XXX add text + +
+
+ + + Redo the bad merge by merging the two backout heads, + which reduces the number of heads in the repository to + two, as can be seen in . +
+ Merging the backouts + + + XXX add text + +
+
+ + + Merge with the commit that was made after the bad + merge, as shown in . +
+ Merging the backouts + + + XXX add text + +
+
+
+
+ + + Protect yourself from <quote>escaped</quote> + changes + + If you've committed some changes to your local repository + and they've been pushed or pulled somewhere else, this isn't + necessarily a disaster. You can protect yourself ahead of + time against some classes of bad changeset. This is + particularly easy if your team usually pulls changes from a + central repository. + + By configuring some hooks on that repository to validate + incoming changesets (see chapter ), + you can + automatically prevent some kinds of bad changeset from being + pushed to the central repository at all. With such a + configuration in place, some kinds of bad changeset will + naturally tend to die out because they can't + propagate into the central repository. Better yet, this + happens without any need for explicit intervention. + + For instance, an incoming change hook that + verifies that a changeset will actually compile can prevent + people from inadvertently breaking the + build. + + + + What to do about sensitive changes that escape + + Even a carefully run project can suffer an unfortunate + event such as the committing and uncontrolled propagation of a + file that contains important passwords. + + If something like this happens to you, and the information + that gets accidentally propagated is truly sensitive, your + first step should be to mitigate the effect of the leak + without trying to control the leak itself. If you are not 100% + certain that you know exactly who could have seen the changes, + you should immediately change passwords, cancel credit cards, + or find some other way to make sure that the information that + has leaked is no longer useful. In other words, assume that + the change has propagated far and wide, and that there's + nothing more you can do. 
+ + You might hope that there would be mechanisms you could + use to either figure out who has seen a change or to erase the + change permanently everywhere, but there are good reasons why + these are not possible. + + Mercurial does not provide an audit trail of who has + pulled changes from a repository, because it is usually either + impossible to record such information or trivial to spoof it. + In a multi-user or networked environment, you should thus be + extremely skeptical of yourself if you think that you have + identified every place that a sensitive changeset has + propagated to. Don't forget that people can and will send + bundles by email, have their backup software save data + offsite, carry repositories on USB sticks, and find other + completely innocent ways to confound your attempts to track + down every copy of a problematic change. + + Mercurial also does not provide a way to make a file or + changeset completely disappear from history, because there is + no way to enforce its disappearance; someone could easily + modify their copy of Mercurial to ignore such directives. In + addition, even if Mercurial provided such a capability, + someone who simply hadn't pulled a make this file + disappear changeset wouldn't be affected by it, nor + would web crawlers visiting at the wrong time, disk backups, + or other mechanisms. Indeed, no distributed revision control + system can make data reliably vanish. Providing the illusion + of such control could easily give a false sense of security, + and be worse than not providing it at all. + +
+ + + Finding the source of a bug + + While it's all very well to be able to back out a changeset + that introduced a bug, this requires that you know which + changeset to back out. Mercurial provides an invaluable + command, called hg bisect, that + helps you to automate this process and accomplish it very + efficiently. + + The idea behind the hg + bisect command is that a changeset has introduced + some change of behavior that you can identify with a simple + pass/fail test. You don't know which piece of code introduced the + change, but you know how to test for the presence of the bug. + The hg bisect command uses your + test to direct its search for the changeset that introduced the + code that caused the bug. + + Here are a few scenarios to help you understand how you + might apply this command. + + The most recent version of your software has a + bug that you remember wasn't present a few weeks ago, but + you don't know when it was introduced. Here, your binary + test checks for the presence of that bug. + + You fixed a bug in a rush, and now it's time to + close the entry in your team's bug database. The bug + database requires a changeset ID when you close an entry, + but you don't remember which changeset you fixed the bug in. + Once again, your binary test checks for the presence of the + bug. + + Your software works correctly, but runs 15% + slower than the last time you measured it. You want to know + which changeset introduced the performance regression. In + this case, your binary test measures the performance of your + software, to see whether it's fast or + slow. + + The sizes of the components of your project that + you ship exploded recently, and you suspect that something + changed in the way you build your project. + + + From these examples, it should be clear that the hg bisect command is not useful only + for finding the sources of bugs. 
You can use it to find any + emergent property of a repository (anything that + you can't find from a simple text search of the files in the + tree) for which you can write a binary test. + + We'll introduce a little bit of terminology here, just to + make it clear which parts of the search process are your + responsibility, and which are Mercurial's. A + test is something that + you run when hg + bisect chooses a changeset. A + probe is what hg + bisect runs to tell whether a revision is good. + Finally, we'll use the word bisect, as both a + noun and a verb, to stand in for the phrase search using + the hg bisect + command. + + One simple way to automate the searching process would be + simply to probe every changeset. However, this scales poorly. + If it took ten minutes to test a single changeset, and you had + 10,000 changesets in your repository, the exhaustive approach + would take on average 35 days to find the + changeset that introduced a bug. Even if you knew that the bug + was introduced by one of the last 500 changesets, and limited + your search to those, you'd still be looking at over 40 hours to + find the changeset that introduced your bug. + + What the hg bisect command + does is use its knowledge of the shape of your + project's revision history to perform a search in time + proportional to the logarithm of the number + of changesets to check (the kind of search it performs is called + a dichotomic search). With this approach, searching through + 10,000 changesets will take less than three hours, even at ten + minutes per test (the search will require about 14 tests). + Limit your search to the last hundred changesets, and it will + take only about an hour (roughly seven tests). + + The hg bisect command is + aware of the branchy nature of a Mercurial + project's revision history, so it has no problems dealing with + branches, merges, or multiple heads in a repository. 
It can + prune entire branches of history with a single probe, which is + how it operates so efficiently. + + + Using the <command role="hg-cmd">hg bisect</command> + command + + Here's an example of hg + bisect in action. + + + In versions 0.9.5 and earlier of Mercurial, hg bisect was not a core command: + it was distributed with Mercurial as an extension. This + section describes the built-in command, not the old + extension. + + + Now let's create a repository, so that we can try out the + hg bisect command in + isolation. + + &interaction.bisect.init; + + We'll simulate a project that has a bug in it in a + simple-minded way: create trivial changes in a loop, and + nominate one specific change that will have the + bug. This loop creates 35 changesets, each + adding a single file to the repository. We'll represent our + bug with a file that contains the text i + have a gub. + + &interaction.bisect.commits; + + The next thing that we'd like to do is figure out how to + use the hg bisect command. + We can use Mercurial's normal built-in help mechanism for + this. + + &interaction.bisect.help; + + The hg bisect command + works in steps. Each step proceeds as follows. + + You run your binary test. + + If the test succeeded, you tell hg bisect by running the + hg bisect --good + command. + + If it failed, run the hg bisect --bad + command. + + The command uses your information to decide + which changeset to test next. + + It updates the working directory to that + changeset, and the process begins again. + + The process ends when hg + bisect identifies a unique changeset that marks + the point where your test transitioned from + succeeding to failing. + + To start the search, we must run the hg bisect --reset command. + + &interaction.bisect.search.init; + + In our case, the binary test we use is simple: we check to + see if any file in the repository contains the string i + have a gub. If it does, this changeset contains the + change that caused the bug. 
By convention, a + changeset that has the property we're searching for is + bad, while one that doesn't is + good. + + Most of the time, the revision to which the working + directory is synced (usually the tip) already exhibits the + problem introduced by the buggy change, so we'll mark it as + bad. + + &interaction.bisect.search.bad-init; + + Our next task is to nominate a changeset that we know + doesn't have the bug; the hg bisect command will + bracket its search between the first pair of + good and bad changesets. In our case, we know that revision + 10 didn't have the bug. (I'll say more about choosing + the first good changeset later.) + + &interaction.bisect.search.good-init; + + Notice that this command printed some output. + + It told us how many changesets it must + consider before it can identify the one that introduced + the bug, and how many tests that will require. + + It updated the working directory to the next + changeset to test, and told us which changeset it's + testing. + + + We now run our test in the working directory. We use the + grep command to see if our + bad file is present in the working directory. + If it is, this revision is bad; if not, this revision is good. + &interaction.bisect.search.step1; + + This test looks like a perfect candidate for automation, + so let's turn it into a shell function. + &interaction.bisect.search.mytest; + + We can now run an entire test step with a single command, + mytest. + + &interaction.bisect.search.step2; + + A few more invocations of our canned test step command, + and we're done. + + &interaction.bisect.search.rest; + + Even though the repository held 35 changesets, the + hg bisect command let us find + the changeset that introduced our bug with only + five tests.
Because the number of tests that the hg bisect command performs grows + logarithmically with the number of changesets to search, the + advantage that it has over the brute force + search approach increases with every changeset you add. + + + + Cleaning up after your search + + When you're finished using the hg + bisect command in a repository, you can use the + hg bisect --reset command to + drop the information it was using to drive your search. The + command doesn't use much space, so it doesn't matter if you + forget to run this command. However, hg bisect won't let you start a new + search in that repository until you do a hg bisect --reset. + + &interaction.bisect.search.reset; + + + + + Tips for finding bugs effectively + + + Give consistent input + + The hg bisect command + requires that you correctly report the result of every test + you perform. If you tell it that a test failed when it really + succeeded, it might be able to detect the + inconsistency. If it can identify an inconsistency in your + reports, it will tell you that a particular changeset is both + good and bad. However, it can't do this perfectly; it's about + as likely to report the wrong changeset as the source of the + bug. + + + + Automate as much as possible + + When I started using the hg + bisect command, I tried a few times to run my + tests by hand, on the command line. This is an approach that + I, at least, am not suited to. After a few tries, I found + that I was making enough mistakes that I was having to restart + my searches several times before finally getting correct + results. + + My initial problems with driving the hg bisect command by hand occurred + even with simple searches on small repositories; if the + problem you're looking for is more subtle, or the number of + tests that hg bisect must + perform increases, the likelihood of operator error ruining + the search is much higher. Once I started automating my + tests, I had much better results. 
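To make the idea of an automated test concrete, here is a minimal Python sketch of the same check that the tutorial's mytest shell function performs. It is only an illustration under stated assumptions: the names SYMPTOM, verdict and bisect_command are my own inventions, not part of Mercurial, and the script merely builds the command line to run rather than invoking hg itself.

```python
import os

# The marker text that stands in for the bug in the tutorial.
SYMPTOM = "i have a gub"


def verdict(root):
    """Return "bad" if any file under root contains SYMPTOM, else "good"."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Do not descend into Mercurial's own metadata directory.
        dirnames[:] = [name for name in dirnames if name != ".hg"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, errors="replace") as handle:
                if SYMPTOM in handle.read():
                    return "bad"
    return "good"


def bisect_command(result):
    """Build the hg bisect invocation that reports the given verdict."""
    if result not in ("good", "bad"):
        raise ValueError("unexpected verdict: %r" % (result,))
    return ["hg", "bisect", "--" + result]
```

Because the verdict is computed by one function and translated into a flag by another, every step of a search tests for the same symptom and reports it in the same way.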
+ + The key to automated testing is twofold: + + always test for the same symptom, and + + always feed consistent input to the hg bisect command. + + In my tutorial example above, the grep + command tests for the symptom, and the if + statement takes the result of this check and ensures that we + always feed the same input to the hg + bisect command. The mytest + function marries these together in a reproducible way, so that + every test is uniform and consistent. + + + + Check your results + + Because the output of a hg + bisect search is only as good as the input you + give it, don't take the changeset it reports as the absolute + truth. A simple way to cross-check its report is to manually + run your test at each of the following changesets: + + The changeset that it reports as the first bad + revision. Your test should still report this as + bad. + + The parent of that changeset (either parent, + if it's a merge). Your test should report this changeset + as good. + + A child of that changeset. Your test should + report this changeset as bad. + + + + + Beware interference between bugs + + It's possible that your search for one bug could be + disrupted by the presence of another. For example, let's say + your software crashes at revision 100, and worked correctly at + revision 50. Unknown to you, someone else introduced a + different crashing bug at revision 60, and fixed it at + revision 80. This could distort your results in one of + several ways. + + It is possible that this other bug completely + masks yours, which is to say that it occurs + before your bug has a chance to manifest itself. If you can't + avoid that other bug (for example, it prevents your project + from building), and so can't tell whether your bug is present + in a particular changeset, the hg + bisect command cannot help you directly. Instead, + you can mark a changeset as untested by running hg bisect --skip. 
+ + A different problem could arise if your test for a bug's + presence is not specific enough. If you check for my + program crashes, then both your crashing bug and an + unrelated crashing bug that masks it will look like the same + thing, and mislead hg + bisect. + + Another useful situation in which to use hg bisect --skip is if you can't + test a revision because your project was in a broken and hence + untestable state at that revision, perhaps because someone + checked in a change that prevented the project from + building. + + + + Bracket your search lazily + + Choosing the first good and + bad changesets that will mark the end points of + your search is often easy, but it bears a little discussion + nevertheless. From the perspective of hg bisect, the newest + changeset is conventionally bad, and the older + changeset is good. + + If you're having trouble remembering when a suitable + good change was, so that you can tell hg bisect, you could do worse than + testing changesets at random. Just remember to eliminate + contenders that can't possibly exhibit the bug (perhaps + because the feature with the bug isn't present yet) and those + where another problem masks the bug (as I discussed + above). + + Even if you end up early by thousands of + changesets or months of history, you will only add a handful + of tests to the total number that hg + bisect must perform, thanks to its logarithmic + behavior. + + + +
+ + diff -r 5276f40fca1c -r 896ab6eaf1c6 en/ch10-hook.xml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/en/ch10-hook.xml Fri Jul 10 02:32:17 2009 +0900 @@ -0,0 +1,1928 @@ + + + + + Handling repository events with hooks + + Mercurial offers a powerful mechanism to let you perform + automated actions in response to events that occur in a + repository. In some cases, you can even control Mercurial's + response to those events. + + The name Mercurial uses for one of these actions is a + hook. Hooks are called + triggers in some revision control systems, but the + two names refer to the same idea. + + + An overview of hooks in Mercurial + + Here is a brief list of the hooks that Mercurial + supports. We will revisit each of these hooks in more detail + later, in . + + Each of the hooks whose description begins with the word + Controlling has the ability to determine whether + an activity can proceed. If the hook succeeds, the activity may + proceed; if it fails, the activity is either not permitted or + undone, depending on the hook. + + + changegroup: This + is run after a group of changesets has been brought into the + repository from elsewhere. + + commit: This is + run after a new changeset has been created in the local + repository. + + incoming: This is + run once for each new changeset that is brought into the + repository from elsewhere. Notice the difference from + changegroup, which is run + once per group of changesets brought + in. + + outgoing: This is + run after a group of changesets has been transmitted from + this repository. + + prechangegroup: + This is run before starting to bring a group of changesets + into the repository. + + + precommit: + Controlling. This is run before starting a commit. + + + preoutgoing: + Controlling. This is run before starting to transmit a group + of changesets from this repository. + + + pretag: + Controlling. This is run before creating a tag. + + + pretxnchangegroup: Controlling. 
This + is run after a group of changesets has been brought into the + local repository from another, but before the transaction + completes that will make the changes permanent in the + repository. + + + pretxncommit: + Controlling. This is run after a new changeset has been + created in the local repository, but before the transaction + completes that will make it permanent. + + + preupdate: + Controlling. This is run before starting an update or merge + of the working directory. + + + tag: This is run + after a tag is created. + + + update: This is + run after an update or merge of the working directory has + finished. + + + + + + Hooks and security + + + Hooks are run with your privileges + + When you run a Mercurial command in a repository, and the + command causes a hook to run, that hook runs on + your system, under + your user account, with + your privilege level. Since hooks are + arbitrary pieces of executable code, you should treat them + with an appropriate level of suspicion. Do not install a hook + unless you are confident that you know who created it and what + it does. + + + In some cases, you may be exposed to hooks that you did + not install yourself. If you work with Mercurial on an + unfamiliar system, Mercurial will run hooks defined in that + system's global ~/.hgrc + file. + + + If you are working with a repository owned by another + user, Mercurial can run hooks defined in that user's + repository, but it will still run them as you. + For example, if you hg pull + from that repository, and its .hg/hgrc defines a local outgoing hook, that hook will run + under your user account, even though you don't own that + repository. + + + + This only applies if you are pulling from a repository + on a local or network filesystem. If you're pulling over + http or ssh, any outgoing + hook will run under whatever account is executing the server + process, on the server. 
+ + + + To see what hooks are defined in a repository, + use the hg showconfig hooks + command. If you are working in one repository, but talking to + another that you do not own (e.g. using hg pull or hg + incoming), remember that it is the other + repository's hooks you should be checking, not your own. + + + + + Hooks do not propagate + + In Mercurial, hooks are not revision controlled, and do + not propagate when you clone, or pull from, a repository. The + reason for this is simple: a hook is a completely arbitrary + piece of executable code. It runs under your user identity, + with your privilege level, on your machine. + + + It would be extremely reckless for any distributed + revision control system to implement revision-controlled + hooks, as this would offer an easily exploitable way to + subvert the accounts of users of the revision control system. + + + Since Mercurial does not propagate hooks, if you are + collaborating with other people on a common project, you + should not assume that they are using the same Mercurial hooks + as you are, or that theirs are correctly configured. You + should document the hooks you expect people to use. + + + In a corporate intranet, this is somewhat easier to + control, as you can for example provide a + standard installation of Mercurial on an NFS + filesystem, and use a site-wide ~/.hgrc file to define hooks that all users will + see. However, this too has its limits; see below. + + + + + Hooks can be overridden + + Mercurial allows you to override a hook definition by + redefining the hook. You can disable it by setting its value + to the empty string, or change its behavior as you wish. + + + If you deploy a system- or site-wide ~/.hgrc file that defines some + hooks, you should thus understand that your users can disable + or override those hooks. + + + + + Ensuring that critical hooks are run + + Sometimes you may want to enforce a policy that you do not + want others to be able to work around. 
For example, you may + have a requirement that every changeset must pass a rigorous + set of tests. Defining this requirement via a hook in a + site-wide ~/.hgrc won't + work for remote users on laptops, and of course local users + can subvert it at will by overriding the hook. + + + Instead, you can set up your policies for use of Mercurial + so that people are expected to propagate changes through a + well-known canonical server that you have + locked down and configured appropriately. + + + One way to do this is via a combination of social + engineering and technology. Set up a restricted-access + account; users can push changes over the network to + repositories managed by this account, but they cannot log into + the account and run normal shell commands. In this scenario, + a user can commit a changeset that contains any old garbage + they want. + + + When someone pushes a changeset to the server that + everyone pulls from, the server will test the changeset before + it accepts it as permanent, and reject it if it fails to pass + the test suite. If people only pull changes from this + filtering server, it will serve to ensure that all changes + that people pull have been automatically vetted. + + + + + + + A short tutorial on using hooks + + It is easy to write a Mercurial hook. Let's start with a + hook that runs when you finish a hg + commit, and simply prints the hash of the changeset + you just created. The hook is called commit. + + + All hooks follow the pattern in this example. + +&interaction.hook.simple.init; + + You add an entry to the hooks section of your ~/.hgrc. On the left is the name of + the event to trigger on; on the right is the action to take. As + you can see, you can run an arbitrary shell command in a hook. + Mercurial passes extra information to the hook using environment + variables (look for HG_NODE in the example). 
+ + + + Performing multiple actions per event + + Quite often, you will want to define more than one hook + for a particular kind of event, as shown below. + +&interaction.hook.simple.ext; + + Mercurial lets you do this by adding an + extension to the end of a hook's name. + You extend a hook's name by giving the name of the hook, + followed by a full stop (the + . character), followed by + some more text of your choosing. For example, Mercurial will + run both commit.foo and + commit.bar when the + commit event occurs. + + + To give a well-defined order of execution when there are + multiple hooks defined for an event, Mercurial sorts hooks by + extension, and executes the hook commands in this sorted + order. In the above example, it will execute + commit.bar before + commit.foo, and commit + before both. + + + It is a good idea to use a somewhat descriptive + extension when you define a new hook. This will help you to + remember what the hook was for. If the hook fails, you'll get + an error message that contains the hook name and extension, so + using a descriptive extension could give you an immediate hint + as to why the hook failed (see for an example). + + + + + Controlling whether an activity can proceed + + In our earlier examples, we used the commit hook, which is run after a + commit has completed. This is one of several Mercurial hooks + that run after an activity finishes. Such hooks have no way + of influencing the activity itself. + + + Mercurial defines a number of events that occur before an + activity starts; or after it starts, but before it finishes. + Hooks that trigger on these events have the added ability to + choose whether the activity can continue, or will abort. + + + The pretxncommit hook runs + after a commit has all but completed. In other words, the + metadata representing the changeset has been written out to + disk, but the transaction has not yet been allowed to + complete. 
The pretxncommit + hook has the ability to decide whether the transaction can + complete, or must be rolled back. + + + If the pretxncommit hook + exits with a status code of zero, the transaction is allowed + to complete; the commit finishes; and the commit hook is run. If the pretxncommit hook exits with a + non-zero status code, the transaction is rolled back; the + metadata representing the changeset is erased; and the + commit hook is not run. + + +&interaction.hook.simple.pretxncommit; + + The hook in the example above checks that a commit comment + contains a bug ID. If it does, the commit can complete. If + not, the commit is rolled back. + + + + + + Writing your own hooks + + When you are writing a hook, you might find it useful to run + Mercurial either with the option, or the verbose config item set to + true. When you do so, Mercurial will print a + message before it calls each hook. + + + + Choosing how your hook should run + + You can write a hook either as a normal + program&emdash;typically a shell script&emdash;or as a Python + function that is executed within the Mercurial process. + + + Writing a hook as an external program has the advantage + that it requires no knowledge of Mercurial's internals. You + can call normal Mercurial commands to get any added + information you need. The trade-off is that external hooks + are slower than in-process hooks. + + + An in-process Python hook has complete access to the + Mercurial API, and does not shell out to + another process, so it is inherently faster than an external + hook. It is also easier to obtain much of the information + that a hook requires by using the Mercurial API than by + running Mercurial commands. + + + If you are comfortable with Python, or require high + performance, writing your hooks in Python may be a good + choice. 
However, when you have a straightforward hook to + write and you don't need to care about performance (probably + the majority of hooks), a shell script is perfectly fine. + + + + + Hook parameters + + Mercurial calls each hook with a set of well-defined + parameters. In Python, a parameter is passed as a keyword + argument to your hook function. For an external program, a + parameter is passed as an environment variable. + + + Whether your hook is written in Python or as a shell + script, the hook-specific parameter names and values will be + the same. A boolean parameter will be represented as a + boolean value in Python, but as the number 1 (for + true) or 0 (for false) as an + environment variable for an external hook. If a hook + parameter is named foo, the keyword + argument for a Python hook will also be named + foo, while the environment variable for an + external hook will be named HG_FOO. + + + + + Hook return values and activity control + + A hook that executes successfully must exit with a status + of zero if external, or return boolean false if + in-process. Failure is indicated with a non-zero exit status + from an external hook, or an in-process hook returning boolean + true. If an in-process hook raises an + exception, the hook is considered to have failed. + + + For a hook that controls whether an activity can proceed, + zero/false means allow, while + non-zero/true/exception means deny. + + + + + Writing an external hook + + When you define an external hook in your ~/.hgrc and the hook is run, its + value is passed to your shell, which interprets it. This + means that you can use normal shell constructs in the body of + the hook. + + + An executable hook is always run with its current + directory set to a repository's root directory. + + + Each hook parameter is passed in as an environment + variable; the name is upper-cased, and prefixed with the + string HG_. 
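As an illustration of this naming rule, here is a minimal sketch of an external hook written as a Python program. It assumes only what is described above: each parameter arrives as an environment variable whose name is the upper-cased parameter name prefixed with HG_, and the exit status reports success or failure. The helper names hook_params and main are my own, and the script does nothing useful beyond echoing the changeset ID it was given.

```python
import os
import sys


def hook_params(environ):
    """Recover hook parameters from an environment mapping.

    A parameter named foo arrives as HG_FOO, so strip the prefix and
    lower-case the remainder to get the parameter name back.
    """
    prefix = "HG_"
    return {
        key[len(prefix):].lower(): value
        for key, value in environ.items()
        if key.startswith(prefix)
    }


def main(environ=os.environ, out=sys.stdout):
    """Report the changeset ID passed to the hook, if there is one."""
    node = hook_params(environ).get("node")
    if node is None:
        out.write("no changeset ID was passed to this hook\n")
        return 1  # non-zero exit status: the hook failed
    out.write("hook saw changeset %s\n" % node)
    return 0      # zero exit status: the hook succeeded

# When saved as a standalone hook script, finish the file with:
#     sys.exit(main())
```

Passing the environment in as an argument keeps the parsing easy to exercise by hand, without having to trigger a real Mercurial event.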
+ + + With the exception of hook parameters, Mercurial does not + set or modify any environment variables when running a hook. + This is useful to remember if you are writing a site-wide hook + that may be run by a number of different users with differing + environment variables set. In multi-user situations, you + should not rely on environment variables being set to the + values you have in your environment when testing the hook. + + + + + Telling Mercurial to use an in-process hook + + The ~/.hgrc syntax + for defining an in-process hook is slightly different than for + an executable hook. The value of the hook must start with the + text python:, and continue + with the fully-qualified name of a callable object to use as + the hook's value. + + + The module in which a hook lives is automatically imported + when a hook is run. So long as you have the module name and + PYTHONPATH right, it should just + work. + + + The following ~/.hgrc + example snippet illustrates the syntax and meaning of the + notions we just described. + + [hooks] +commit.example = python:mymodule.submodule.myhook + When Mercurial runs the commit.example + hook, it imports mymodule.submodule, looks + for the callable object named myhook, and + calls it. + + + + + Writing an in-process hook + + The simplest in-process hook does nothing, but illustrates + the basic shape of the hook API: + + def myhook(ui, repo, **kwargs): + pass + The first argument to a Python hook is always a ui object. The second + is a repository object; at the moment, it is always an + instance of localrepository. + Following these two arguments are other keyword arguments. + Which ones are passed in depends on the hook being called, but + a hook can ignore arguments it doesn't care about by dropping + them into a keyword argument dict, as with + **kwargs above. + + + + + + Some hook examples + + + Writing meaningful commit messages + + It's hard to imagine a useful commit message being very + short. 
The simple pretxncommit + hook of the example below will prevent you from committing a + changeset with a message that is less than ten bytes long. + + +&interaction.hook.msglen.go; + + + + Checking for trailing whitespace + + An interesting use of a commit-related hook is to help you + to write cleaner code. A simple example of cleaner + code is the dictum that a change should not add any + new lines of text that contain trailing + whitespace. Trailing whitespace is a series of + space and tab characters at the end of a line of text. In + most cases, trailing whitespace is unnecessary, invisible + noise, but it is occasionally problematic, and people often + prefer to get rid of it. + + + You can use either the precommit or pretxncommit hook to tell whether you + have a trailing whitespace problem. If you use the precommit hook, the hook will not know + which files you are committing, so it will have to check every + modified file in the repository for trailing whitespace. If + you want to commit a change to just the file + foo, but the file + bar contains trailing whitespace, doing a + check in the precommit hook + will prevent you from committing foo due + to the problem with bar. This doesn't + seem right. + + + Should you choose the pretxncommit hook, the check won't + occur until just before the transaction for the commit + completes. This will allow you to check for problems in only the + exact files that are being committed. However, if you entered + the commit message interactively and the hook fails, the + transaction will roll back; you'll have to re-enter the commit + message after you fix the trailing whitespace and run hg commit again. + + + &interaction.ch09-hook.ws.simple; + + In this example, we introduce a simple pretxncommit hook that checks for + trailing whitespace. This hook is short, but not very + helpful.
It exits with an error status if a change adds a + line with trailing whitespace to any file, but does not print + any information that might help us to identify the offending + file or line. It also has the nice property of not paying + attention to unmodified lines; only lines that introduce new + trailing whitespace cause problems. + + + &ch09-check_whitespace.py.lst; + + The above version is much more complex, but also more + useful. It parses a unified diff to see if any lines add + trailing whitespace, and prints the name of the file and the + line number of each such occurrence. Even better, if the + change adds trailing whitespace, this hook saves the commit + comment and prints the name of the save file before exiting + and telling Mercurial to roll the transaction back, so you can + use the + option to hg commit to reuse + the saved commit message once you've corrected the problem. + + + &interaction.ch09-hook.ws.better; + + As a final aside, note in the example above the + use of sed's in-place editing feature to + get rid of trailing whitespace from a file. This is concise + and useful enough that I will reproduce it here (using + perl for good measure). + perl -pi -e 's,\s+$,,' filename + + + + + Bundled hooks + + Mercurial ships with several bundled hooks. You can find + them in the hgext + directory of a Mercurial source tree. If you are using a + Mercurial binary package, the hooks will be located in the + hgext directory of + wherever your package installer put Mercurial. + + + + <literal role="hg-ext">acl</literal>&emdash;access + control for parts of a repository + + The acl extension lets + you control which remote users are allowed to push changesets + to a networked server. You can protect any portion of a + repository (including the entire repo), so that a specific + remote user can push changes that do not affect the protected + portion. 
+ + + This extension implements access control based on the + identity of the user performing a push, + not on who committed the changesets + they're pushing. It makes sense to use this hook only if you + have a locked-down server environment that authenticates + remote users, and you want to be sure that only specific users + are allowed to push changes to that server. + + + + Configuring the <literal role="hook">acl</literal> + hook + + In order to manage incoming changesets, the acl hook must be used as a + pretxnchangegroup hook. This + lets it see which files are modified by each incoming + changeset, and roll back a group of changesets if they + modify forbidden files. Example: + + [hooks] +pretxnchangegroup.acl = python:hgext.acl.hook + + The acl extension is + configured using three sections. + + + The acl section has + only one entry, sources, + which lists the sources of incoming changesets that the hook + should pay attention to. You don't normally need to + configure this section. + + + serve: + Control incoming changesets that are arriving from a + remote repository over http or ssh. This is the default + value of sources, and + usually the only setting you'll need for this + configuration item. + + + pull: + Control incoming changesets that are arriving via a pull + from a local repository. + + + push: + Control incoming changesets that are arriving via a push + from a local repository. + + + bundle: + Control incoming changesets that are arriving from + another repository via a bundle. + + + + The acl.allow + section controls the users that are allowed to add + changesets to the repository. If this section is not + present, all users that are not explicitly denied are + allowed. If this section is present, all users that are not + explicitly allowed are denied (so an empty section means + that all users are denied). + + + The acl.deny + section determines which users are denied from adding + changesets to the repository. 
If this section is not + present or is empty, no users are denied. + + + The syntaxes for the acl.allow and acl.deny sections are + identical. On the left of each entry is a glob pattern that + matches files or directories, relative to the root of the + repository; on the right, a user name. + + + In the following example, the user + docwriter can only push changes to the + docs subtree of the + repository, while intern can push changes + to any file or directory except source/sensitive. + + [acl.allow] +docs/** = docwriter +[acl.deny] +source/sensitive/** = intern + + + + Testing and troubleshooting + + If you want to test the acl hook, run it with Mercurial's + debugging output enabled. Since you'll probably be running + it on a server where it's not convenient (or sometimes + possible) to pass in the option, don't forget + that you can enable debugging output in your ~/.hgrc: + + [ui] +debug = true + With this enabled, the acl hook will print enough + information to let you figure out why it is allowing or + forbidding pushes from specific users. + + + + + + <literal + role="hg-ext">bugzilla</literal>&emdash;integration with + Bugzilla + + The bugzilla extension + adds a comment to a Bugzilla bug whenever it finds a reference + to that bug ID in a commit comment. You can install this hook + on a shared server, so that any time a remote user pushes + changes to this server, the hook gets run. + + + It adds a comment to the bug that looks like this (you can + configure the contents of the comment&emdash;see below): + + Changeset aad8b264143a, made by Joe User + <joe.user@domain.com> in the frobnitz repository, refers + to this bug. For complete details, see + http://hg.domain.com/frobnitz?cmd=changeset;node=aad8b264143a + Changeset description: Fix bug 10483 by guarding against some + NULL pointers + The value of this hook is that it automates the process of + updating a bug any time a changeset refers to it. 
If you + configure the hook properly, it makes it easy for people to + browse straight from a Bugzilla bug to a changeset that refers + to that bug. + + + You can use the code in this hook as a starting point for + some more exotic Bugzilla integration recipes. Here are a few + possibilities: + + + Require that every changeset pushed to the + server have a valid bug ID in its commit comment. In this + case, you'd want to configure the hook as a pretxncommit hook. This would + allow the hook to reject changes that didn't contain bug + IDs. + + + Allow incoming changesets to automatically + modify the state of a bug, as well as + simply adding a comment. For example, the hook could + recognise the string fixed bug 31337 as + indicating that it should update the state of bug 31337 to + requires testing. + + + + + Configuring the <literal role="hook">bugzilla</literal> + hook + + You should configure this hook in your server's + ~/.hgrc as an incoming hook, for example as + follows: + + [hooks] +incoming.bugzilla = python:hgext.bugzilla.hook + + Because of the specialised nature of this hook, and + because Bugzilla was not written with this kind of + integration in mind, configuring this hook is a somewhat + involved process. + + + Before you begin, you must install the MySQL bindings + for Python on the host(s) where you'll be running the hook. + If this is not available as a binary package for your + system, you can download it from + web:mysql-python. + + + Configuration information for this hook lives in the + bugzilla section of + your ~/.hgrc. + + + version: The version + of Bugzilla installed on the server. The database + schema that Bugzilla uses changes occasionally, so this + hook has to know exactly which schema to use. + + host: + The hostname of the MySQL server that stores your + Bugzilla data. The database must be configured to allow + connections from whatever host you are running the + bugzilla hook on. 
+ + + user: + The username with which to connect to the MySQL server. + The database must be configured to allow this user to + connect from whatever host you are running the bugzilla hook on. This user + must be able to access and modify Bugzilla tables. The + default value of this item is bugs, + which is the standard name of the Bugzilla user in a + MySQL database. + + + password: The MySQL + password for the user you configured above. This is + stored as plain text, so you should make sure that + unauthorised users cannot read the ~/.hgrc file where you + store this information. + + + db: + The name of the Bugzilla database on the MySQL server. + The default value of this item is + bugs, which is the standard name of + the MySQL database where Bugzilla stores its data. + + + notify: If you want + Bugzilla to send out a notification email to subscribers + after this hook has added a comment to a bug, you will + need this hook to run a command whenever it updates the + database. The command to run depends on where you have + installed Bugzilla, but it will typically look something + like this, if you have Bugzilla installed in /var/www/html/bugzilla: + + cd /var/www/html/bugzilla && + ./processmail %s nobody@nowhere.com + + The Bugzilla + processmail program expects to be + given a bug ID (the hook replaces + %s with the bug ID) + and an email address. It also expects to be able to + write to some files in the directory that it runs in. + If Bugzilla and this hook are not installed on the same + machine, you will need to find a way to run + processmail on the server where + Bugzilla is installed. + + + + + + Mapping committer names to Bugzilla user names + + By default, the bugzilla hook tries to use the + email address of a changeset's committer as the Bugzilla + user name with which to update a bug. If this does not suit + your needs, you can map committer email addresses to + Bugzilla user names using a usermap section. 
+ + + Each item in the usermap section contains an + email address on the left, and a Bugzilla user name on the + right. + + [usermap] +jane.user@example.com = jane + You can either keep the usermap data in a normal + ~/.hgrc, or tell the + bugzilla hook to read the + information from an external usermap + file. In the latter case, you can store + usermap data by itself in (for example) + a user-modifiable repository. This makes it possible to let + your users maintain their own usermap entries. The main + ~/.hgrc file might look + like this: + + # regular hgrc file refers to external usermap file +[bugzilla] +usermap = /home/hg/repos/userdata/bugzilla-usermap.conf + While the usermap file that it + refers to might look like this: + + # bugzilla-usermap.conf - inside a hg repository +[usermap] stephanie@example.com = steph + + + + Configuring the text that gets added to a bug + + You can configure the text that this hook adds as a + comment; you specify it in the form of a Mercurial template. + Several ~/.hgrc entries + (still in the bugzilla + section) control this behavior. + + + strip: The number of + leading path elements to strip from a repository's path + name to construct a partial path for a URL. For example, + if the repositories on your server live under /home/hg/repos, and you + have a repository whose path is /home/hg/repos/app/tests, + then setting strip to + 4 will give a partial path of + app/tests. The + hook will make this partial path available when + expanding a template, as webroot. + + + template: The text of the + template to use. In addition to the usual + changeset-related variables, this template can use + hgweb (the value of the + hgweb configuration item above) and + webroot (the path constructed using + strip above). + + + + In addition, you can add a baseurl item to the web section of your ~/.hgrc. 
The bugzilla hook will make this + available when expanding a template, as the base string to + use when constructing a URL that will let users browse from + a Bugzilla comment to view a changeset. Example: + + [web] +baseurl = http://hg.domain.com/ + + Here is an example set of bugzilla hook config information. + + + &ch10-bugzilla-config.lst; + + + + Testing and troubleshooting + + The most common problems with configuring the bugzilla hook relate to running + Bugzilla's processmail script and + mapping committer names to user names. + + + Recall from above that the user + that runs the Mercurial process on the server is also the + one that will run the processmail + script. The processmail script + sometimes causes Bugzilla to write to files in its + configuration directory, and Bugzilla's configuration files + are usually owned by the user that your web server runs + under. + + + You can cause processmail to be run + with the suitable user's identity using the + sudo command. Here is an example entry + for a sudoers file. + + hg_user = (httpd_user) +NOPASSWD: /var/www/html/bugzilla/processmail-wrapper %s + This allows the hg_user user to run a + processmail-wrapper program under the + identity of httpd_user. + + + This indirection through a wrapper script is necessary, + because processmail expects to be run + with its current directory set to wherever you installed + Bugzilla; you can't specify that kind of constraint in a + sudoers file. The contents of the + wrapper script are simple: + + #!/bin/sh +cd `dirname $0` && ./processmail "$1" nobody@example.com + It doesn't seem to matter what email address you pass to + processmail. + + + If your usermap is + not set up correctly, users will see an error message from + the bugzilla hook when they + push changes to the server. 
The error message will look + like this: + + cannot find bugzilla user id for john.q.public@example.com + What this means is that the committer's address, + john.q.public@example.com, is not a valid + Bugzilla user name, nor does it have an entry in your + usermap that maps it to + a valid Bugzilla user name. + + + + + + <literal role="hg-ext">notify</literal>&emdash;send email + notifications + + Although Mercurial's built-in web server provides RSS + feeds of changes in every repository, many people prefer to + receive change notifications via email. The notify hook lets you send out + notifications to a set of email addresses whenever changesets + arrive that those subscribers are interested in. + + + As with the bugzilla + hook, the notify hook is + template-driven, so you can customise the contents of the + notification messages that it sends. + + + By default, the notify + hook includes a diff of every changeset that it sends out; you + can limit the size of the diff, or turn this feature off + entirely. It is useful for letting subscribers review changes + immediately, rather than clicking to follow a URL. + + + + Configuring the <literal role="hg-ext">notify</literal> + hook + + You can set up the notify hook to send one email + message per incoming changeset, or one per incoming group of + changesets (all those that arrived in a single pull or + push). + + [hooks] +# send one email per group of changes +changegroup.notify = python:hgext.notify.hook +# send one email per change +incoming.notify = python:hgext.notify.hook + + Configuration information for this hook lives in the + notify section of a + ~/.hgrc file. + + + test: + By default, this hook does not send out email at all; + instead, it prints the message that it + would send. Set this item to + false to allow email to be sent. 
The + reason that sending of email is turned off by default is + that it takes several tries to configure this extension + exactly as you would like, and it would be bad form to + spam subscribers with a number of broken + notifications while you debug your configuration. + + + config: + The path to a configuration file that contains + subscription information. This is kept separate from + the main ~/.hgrc so + that you can maintain it in a repository of its own. + People can then clone that repository, update their + subscriptions, and push the changes back to your server. + + + strip: + The number of leading path separator characters to strip + from a repository's path, when deciding whether a + repository has subscribers. For example, if the + repositories on your server live in /home/hg/repos, and + notify is considering a + repository named /home/hg/repos/shared/test, + setting strip to + 4 will cause notify to trim the path it + considers down to shared/test, and it will + match subscribers against that. + + + template: The template + text to use when sending messages. This specifies both + the contents of the message header and its body. + + + maxdiff: The maximum + number of lines of diff data to append to the end of a + message. If a diff is longer than this, it is + truncated. By default, this is set to 300. Set this to + 0 to omit diffs from notification + emails. + + + sources: A list of + sources of changesets to consider. This lets you limit + notify to only sending + out email about changes that remote users pushed into + this repository via a server, for example. See + for the sources you + can specify here. + + + + If you set the baseurl + item in the web section, + you can use it in a template; it will be available as + webroot. + + + Here is an example set of notify configuration information. 
+ + + &ch10-notify-config.lst; + + This will produce a message that looks like the + following: + + + &ch10-notify-config-mail.lst; + + + + Testing and troubleshooting + + Do not forget that by default, the notify extension will not + send any mail until you explicitly configure it to do so, + by setting test to + false. Until you do that, it simply + prints the message it would send. + + + + + + + Information for writers of hooks + + + In-process hook execution + + An in-process hook is called with arguments of the + following form: + + def myhook(ui, repo, **kwargs): pass + The ui parameter is a ui object. The + repo parameter is a localrepository + object. The names and values of the + **kwargs parameters depend on the hook + being invoked, with the following common features: + + + If a parameter is named + node or parentN, it + will contain a hexadecimal changeset ID. The empty string + is used to represent a null changeset ID + instead of a string of zeroes. + + + If a parameter is named + url, it will contain the URL of a + remote repository, if that can be determined. + + + Boolean-valued parameters are represented as + Python bool objects. + + + + An in-process hook is called without a change to the + process's working directory (unlike external hooks, which are + run in the root of the repository). It must not change the + process's working directory, or it will cause any calls it + makes into the Mercurial API to fail. + + + If a hook returns a boolean false value, it + is considered to have succeeded. If it returns a boolean + true value or raises an exception, it is + considered to have failed. A useful way to think of the + calling convention is "tell me if you fail". + + + Note that changeset IDs are passed into Python hooks as + hexadecimal strings, not the binary hashes that Mercurial's + APIs normally use. To convert a hash from hex to binary, use + the bin function.
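The following sketch shows the hex-to-binary conversion and the "tell me if you fail" return convention together. So that it runs without Mercurial installed, it uses binascii.unhexlify from the Python standard library; as far as I know, Mercurial's own bin function (in the mercurial.node module) behaves the same way for valid input, but treat that as an assumption. The helper name node_to_binary and the length check are invented for illustration.

```python
from binascii import unhexlify


def node_to_binary(hex_node):
    """Convert a hexadecimal changeset ID to its binary form.

    The empty string (a null changeset ID) converts to empty bytes.
    """
    return unhexlify(hex_node)


def myhook(ui, repo, node="", **kwargs):
    """Sanity-check the node parameter; ignore all other parameters."""
    binary = node_to_binary(node)
    # A full 40-character hex ID corresponds to a 20-byte binary hash.
    if node and len(binary) != 20:
        return True   # a true value tells Mercurial the hook failed
    return False      # a false value means the hook succeeded
```

A hook that needs to call into Mercurial's APIs would pass the binary value on, while keeping the hexadecimal string for anything it prints.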
+ + + + + External hook execution + + An external hook is passed to the shell of the user + running Mercurial. Features of that shell, such as variable + substitution and command redirection, are available. The hook + is run in the root directory of the repository (unlike + in-process hooks, which are run in the same directory that + Mercurial was run in). + + + Hook parameters are passed to the hook as environment + variables. Each environment variable's name is converted to + upper case and prefixed with the string + HG_. For example, if the + name of a parameter is node, + the name of the environment variable representing that + parameter will be HG_NODE. + + + A boolean parameter is represented as the string + 1 for true, + 0 for false. + If an environment variable is named HG_NODE, + HG_PARENT1 or HG_PARENT2, it + contains a changeset ID represented as a hexadecimal string. + The empty string is used to represent a null changeset + ID instead of a string of zeroes. If an environment + variable is named HG_URL, it will contain the + URL of a remote repository, if that can be determined. + + + If a hook exits with a status of zero, it is considered to + have succeeded. If it exits with a non-zero status, it is + considered to have failed. + + + + + Finding out where changesets come from + + A hook that involves the transfer of changesets between a + local repository and another may be able to find out + information about the far side. Mercurial + knows how changes are being transferred, + and in many cases where they are being + transferred to or from. + + + + Sources of changesets + + Mercurial will tell a hook what means are, or were, used + to transfer changesets between repositories. This is + provided by Mercurial in a Python parameter named + source, or an environment variable named + HG_SOURCE. + + + + serve: Changesets are + transferred to or from a remote repository over http or + ssh.
+ + + pull: Changesets are + being transferred via a pull from one repository into + another. + + + push: Changesets are + being transferred via a push from one repository into + another. + + + bundle: Changesets are + being transferred to or from a bundle. + + + + + + Where changes are going&emdash;remote repository + URLs + + When possible, Mercurial will tell a hook the location + of the far side of an activity that transfers + changeset data between repositories. This is provided by + Mercurial in a Python parameter named + url, or an environment variable named + HG_URL. + + + This information is not always known. If a hook is + invoked in a repository that is being served via http or + ssh, Mercurial cannot tell where the remote repository is, + but it may know where the client is connecting from. In + such cases, the URL will take one of the following forms: + + + remote:ssh:1.2.3.4&emdash;remote + ssh client, at the IP address + 1.2.3.4. + + + remote:http:1.2.3.4&emdash;remote + http client, at the IP address + 1.2.3.4. If the client is using SSL, + this will be of the form + remote:https:1.2.3.4. + + + Empty&emdash;no information could be + discovered about the remote client. + + + + + + + Hook reference + + + <literal role="hook">changegroup</literal>&emdash;after + remote changesets added + + This hook is run after a group of pre-existing changesets + has been added to the repository, for example via a hg pull or hg + unbundle. This hook is run once per operation + that added one or more changesets. This is in contrast to the + incoming hook, which is run + once per changeset, regardless of whether the changesets + arrive in a group. + + + Some possible uses for this hook include kicking off an + automated build or test of the added changesets, updating a + bug database, or notifying subscribers that a repository + contains new changes. + + + Parameters to this hook: + + + node: A changeset ID. 
The + changeset ID of the first changeset in the group that was + added. All changesets between this and + tip, inclusive, were added by a single + hg pull, hg push or hg unbundle. + + + source: A + string. The source of these changes. See for details. + + + url: A URL. The + location of the remote repository, if known. See for more information. + + + + See also: incoming (), prechangegroup (), pretxnchangegroup () + + + + + <literal role="hook">commit</literal>&emdash;after a new + changeset is created + + This hook is run after a new changeset has been created. + + + Parameters to this hook: + + + node: A changeset ID. The + changeset ID of the newly committed changeset. + + + parent1: A changeset ID. + The changeset ID of the first parent of the newly + committed changeset. + + + parent2: A changeset ID. + The changeset ID of the second parent of the newly + committed changeset. + + + + See also: precommit (), pretxncommit () + + + + + <literal role="hook">incoming</literal>&emdash;after one + remote changeset is added + + This hook is run after a pre-existing changeset has been + added to the repository, for example via a hg push. If a group of changesets + was added in a single operation, this hook is called once for + each added changeset. + + + You can use this hook for the same purposes as + the changegroup hook (); it's simply more + convenient sometimes to run a hook once per group of + changesets, while other times it's handier once per changeset. + + + Parameters to this hook: + + + node: A changeset ID. The + ID of the newly added changeset. + + + source: A + string. The source of these changes. See for details. + + + url: A URL. The + location of the remote repository, if known. See for more information. 
+ + + + See also: changegroup (), prechangegroup (), pretxnchangegroup () + + + + + <literal role="hook">outgoing</literal>&emdash;after + changesets are propagated + + This hook is run after a group of changesets has been + propagated out of this repository, for example by a hg push or hg + bundle command. + + + One possible use for this hook is to notify administrators + that changes have been pulled. + + + Parameters to this hook: + + + node: A changeset ID. The + changeset ID of the first changeset of the group that was + sent. + + + source: A string. The + source of the operation (see ). If a remote + client pulled changes from this repository, + source will be + serve. If the client that obtained + changes from this repository was local, + source will be + bundle, pull, or + push, depending on the operation the + client performed. + + + url: A URL. The + location of the remote repository, if known. See for more information. + + + + See also: preoutgoing () + + + + + <literal + role="hook">prechangegroup</literal>&emdash;before starting + to add remote changesets + + This controlling hook is run before Mercurial begins to + add a group of changesets from another repository. + + + This hook does not have any information about the + changesets to be added, because it is run before transmission + of those changesets is allowed to begin. If this hook fails, + the changesets will not be transmitted. + + + One use for this hook is to prevent external changes from + being added to a repository. For example, you could use this + to freeze a server-hosted branch temporarily or + permanently so that users cannot push to it, while still + allowing a local administrator to modify the repository. + + + Parameters to this hook: + + + source: A string. The + source of these changes. See for details. + + + url: A URL. The + location of the remote repository, if known. See for more information.
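As a rough sketch of the freeze idea described above, an in-process prechangegroup hook could deny any group of changesets whose source is serve, and allow everything else. The function name freeze and the message text are made up for this example. The comparison accepts both text and byte strings on the assumption that the type of hook arguments differs between Python 2 and Python 3 versions of Mercurial.

```python
def freeze(ui, repo, source=None, **kwargs):
    """Deny changesets arriving over http or ssh; allow everything else."""
    # Assumption: depending on the Mercurial and Python version, source
    # may be a text string or a byte string, so accept either.
    if source in ("serve", b"serve"):
        ui.warn(b"this repository is frozen; pushes are not accepted\n")
        return True   # true means the hook failed, so the activity is denied
    return False      # false means success, so the activity may proceed
```

You would refer to such a function from the hooks section of the server's ~/.hgrc with an entry of the form prechangegroup.freeze = python:mymodule.freeze, where mymodule is a hypothetical module on your PYTHONPATH. Because only the serve source is refused, a local administrator working directly in the repository is unaffected.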
+ + + + See also: changegroup (), incoming (), pretxnchangegroup () + + + + + <literal role="hook">precommit</literal>&emdash;before + starting to commit a changeset + + This hook is run before Mercurial begins to commit a new + changeset. It is run before Mercurial has any of the metadata + for the commit, such as the files to be committed, the commit + message, or the commit date. + + + One use for this hook is to disable the ability to commit + new changesets, while still allowing incoming changesets. + Another is to run a build or test, and only allow the commit + to begin if the build or test succeeds. + + + Parameters to this hook: + + + parent1: A changeset ID. + The changeset ID of the first parent of the working + directory. + + + parent2: A changeset ID. + The changeset ID of the second parent of the working + directory. + + + If the commit proceeds, the parents of the working + directory will become the parents of the new changeset. + + + See also: commit + (), pretxncommit () + + + + + <literal role="hook">preoutgoing</literal>&emdash;before + starting to propagate changesets + + This hook is invoked before Mercurial knows the identities + of the changesets to be transmitted. + + + One use for this hook is to prevent changes from being + transmitted to another repository. + + + Parameters to this hook: + + + source: A + string. The source of the operation that is attempting to + obtain changes from this repository (see ). See the documentation + for the source parameter to the + outgoing hook, in + , for possible values + of this parameter. + + + url: A URL. The + location of the remote repository, if known. See for more information. + + + + See also: outgoing () + + + + + <literal role="hook">pretag</literal>&emdash;before + tagging a changeset + + This controlling hook is run before a tag is created. If + the hook succeeds, creation of the tag proceeds. If the hook + fails, the tag is not created. + + + Parameters to this hook: + + + local: A boolean. 
Whether + the tag is local to this repository instance (i.e. stored + in .hg/localtags) or + managed by Mercurial (stored in .hgtags). + + + node: A changeset ID. The + ID of the changeset to be tagged. + + + tag: A string. The name of + the tag to be created. + + + + If the tag to be created is + revision-controlled, the precommit and pretxncommit hooks ( and ) will also be run. + + + See also: tag + () + + + + + <literal + role="hook">pretxnchangegroup</literal>&emdash;before + completing addition of remote changesets + + This controlling hook is run before a + transaction&emdash;that manages the addition of a group of new + changesets from outside the repository&emdash;completes. If + the hook succeeds, the transaction completes, and all of the + changesets become permanent within this repository. If the + hook fails, the transaction is rolled back, and the data for + the changesets is erased. + + + This hook can access the metadata associated with the + almost-added changesets, but it should not do anything + permanent with this data. It must also not modify the working + directory. + + + While this hook is running, if other Mercurial processes + access this repository, they will be able to see the + almost-added changesets as if they are permanent. This may + lead to race conditions if you do not take steps to avoid + them. + + + This hook can be used to automatically vet a group of + changesets. If the hook fails, all of the changesets are + rejected when the transaction rolls back. + + + Parameters to this hook: + + + node: A changeset ID. The + changeset ID of the first changeset in the group that was + added. All changesets between this and + tip, + inclusive, were added by a single hg pull, hg push or hg unbundle. + + + source: A + string. The source of these changes. See for details. + + + url: A URL. The + location of the remote repository, if known. See for more information. 
+ + + + See also: changegroup (), incoming (), prechangegroup () + + + + + <literal role="hook">pretxncommit</literal>&emdash;before + completing commit of new changeset + + This controlling hook is run before a + transaction&emdash;that manages a new commit&emdash;completes. + If the hook succeeds, the transaction completes and the + changeset becomes permanent within this repository. If the + hook fails, the transaction is rolled back, and the commit + data is erased. + + + This hook can access the metadata associated with the + almost-new changeset, but it should not do anything permanent + with this data. It must also not modify the working + directory. + + + While this hook is running, if other Mercurial processes + access this repository, they will be able to see the + almost-new changeset as if it is permanent. This may lead to + race conditions if you do not take steps to avoid them. + + + Parameters to this hook: + + + node: A changeset ID. The + changeset ID of the newly committed changeset. + + + parent1: A changeset ID. + The changeset ID of the first parent of the newly + committed changeset. + + + parent2: A changeset ID. + The changeset ID of the second parent of the newly + committed changeset. + + + + See also: precommit () + + + + + <literal role="hook">preupdate</literal>&emdash;before + updating or merging working directory + + This controlling hook is run before an update + or merge of the working directory begins. It is run only if + Mercurial's normal pre-update checks determine that the update + or merge can proceed. If the hook succeeds, the update or + merge may proceed; if it fails, the update or merge does not + start. + + + Parameters to this hook: + + + parent1: A + changeset ID. The ID of the parent that the working + directory is to be updated to. If the working directory + is being merged, it will not change this parent. + + + parent2: A + changeset ID. Only set if the working directory is being + merged. 
The ID of the revision that the working directory + is being merged with. + + + + See also: update + () + + + + <literal role="hook">tag</literal>&emdash;after tagging a + changeset + + This hook is run after a tag has been created. + + + Parameters to this hook: + + + local: A boolean. Whether + the new tag is local to this repository instance (i.e. + stored in .hg/localtags) or managed by + Mercurial (stored in .hgtags). + + + node: A changeset ID. The + ID of the changeset that was tagged. + + + tag: A string. The name of + the tag that was created. + + + + If the created tag is revision-controlled, the commit hook (section ) is run before this hook. + + + See also: pretag + () + + + + + <literal role="hook">update</literal>&emdash;after + updating or merging working directory + + This hook is run after an update or merge of the working + directory completes. Since a merge can fail (if the external + hgmerge command fails to resolve conflicts + in a file), this hook communicates whether the update or merge + completed cleanly. + + + + error: A boolean. + Indicates whether the update or merge completed + successfully. + + + parent1: A changeset ID. + The ID of the parent that the working directory was + updated to. If the working directory was merged, it will + not have changed this parent. + + + parent2: A changeset ID. + Only set if the working directory was merged. The ID of + the revision that the working directory was merged with. + + + + See also: preupdate + () + + + + + + + diff -r 5276f40fca1c -r 896ab6eaf1c6 en/ch10-template.xml --- a/en/ch10-template.xml Thu Jul 09 13:32:44 2009 +0900 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,673 +0,0 @@ - - - - - Customising the output of Mercurial - - Mercurial provides a powerful mechanism to let you control how - it displays information. The mechanism is based on templates. 
- You can use templates to generate specific output for a single - command, or to customise the entire appearance of the built-in web - interface. - - - Using precanned output styles - - Packaged with Mercurial are some output styles that you can - use immediately. A style is simply a precanned template that - someone wrote and installed somewhere that Mercurial can - find. - - Before we take a look at Mercurial's bundled styles, let's - review its normal output. - - &interaction.template.simple.normal; - - This is somewhat informative, but it takes up a lot of - space&emdash;five lines of output per changeset. The - compact style reduces this to three lines, - presented in a sparse manner. - - &interaction.template.simple.compact; - - The changelog style hints at the - expressive power of Mercurial's templating engine. This style - attempts to follow the GNU Project's changelog - guidelinesweb:changelog. - - &interaction.template.simple.changelog; - - You will not be shocked to learn that Mercurial's default - output style is named default. - - - Setting a default style - - You can modify the output style that Mercurial will use - for every command by editing your ~/.hgrc file, naming the style - you would prefer to use. - - [ui] -style = compact - - If you write a style of your own, you can use it by either - providing the path to your style file, or copying your style - file into a location where Mercurial can find it (typically - the templates subdirectory of your - Mercurial install directory). - - - - - Commands that support styles and templates - - All of Mercurial's - log-like commands let you use - styles and templates: hg - incoming, hg log, - hg outgoing, and hg tip. - - As I write this manual, these are so far the only commands - that support styles and templates. 
Since these are the most - important commands that need customisable output, there has been - little pressure from the Mercurial user community to add style - and template support to other commands. - - - - The basics of templating - - At its simplest, a Mercurial template is a piece of text. - Some of the text never changes, while other parts are - expanded, or replaced with new text, when - necessary. - - Before we continue, let's look again at a simple example of - Mercurial's normal output. - - &interaction.template.simple.normal; - - Now, let's run the same command, but using a template to - change its output. - - &interaction.template.simple.simplest; - - The example above illustrates the simplest possible - template; it's just a piece of static text, printed once for - each changeset. The option to the hg log command tells Mercurial to use - the given text as the template when printing each - changeset. - - Notice that the template string above ends with the text - \n. This is an - escape sequence, telling Mercurial to print - a newline at the end of each template item. If you omit this - newline, Mercurial will run each piece of output together. See - for more details - of escape sequences. - - A template that prints a fixed string of text all the time - isn't very useful; let's try something a bit more - complex. - - &interaction.template.simple.simplesub; - - As you can see, the string - {desc} in the template has - been replaced in the output with the description of each - changeset. Every time Mercurial finds text enclosed in curly - braces ({ and - }), it will try to replace the - braces and text with the expansion of whatever is inside. To - print a literal curly brace, you must escape it, as described in - . - - - - Common template keywords - - You can start writing simple templates immediately using the - keywords below. - - - author: String. The - unmodified author of the changeset. - - branches: String. 
The - name of the branch on which the changeset was committed. - Will be empty if the branch name was - default. - - date: - Date information. The date when the changeset was - committed. This is not human-readable; - you must pass it through a filter that will render it - appropriately. See for more information - on filters. The date is expressed as a pair of numbers. The - first number is a Unix UTC timestamp (seconds since January - 1, 1970); the second is the offset of the committer's - timezone from UTC, in seconds. - - desc: - String. The text of the changeset description. - - files: List of strings. - All files modified, added, or removed by this - changeset. - - file_adds: List of - strings. Files added by this changeset. - - file_dels: List of - strings. Files removed by this changeset. - - node: - String. The changeset identification hash, as a - 40-character hexadecimal string. - - parents: List of - strings. The parents of the changeset. - - rev: - Integer. The repository-local changeset revision - number. - - tags: - List of strings. Any tags associated with the - changeset. - - - A few simple experiments will show us what to expect when we - use these keywords; you can see the results below. - -&interaction.template.simple.keywords; - - As we noted above, the date keyword does not produce - human-readable output, so we must treat it specially. This - involves using a filter, about which more - in . - - &interaction.template.simple.datekeyword; - - - - Escape sequences - - Mercurial's templating engine recognises the most commonly - used escape sequences in strings. When it sees a backslash - (\) character, it looks at the - following character and substitutes the two characters with a - single replacement, as described below. - - - \: - Backslash, \, ASCII - 134. - - \n: Newline, - ASCII 12. - - \r: Carriage - return, ASCII 15. - - \t: Tab, ASCII - 11. - - \v: Vertical - tab, ASCII 13. - - {: Open curly - brace, {, ASCII - 173. 
- - }: Close curly - brace, }, ASCII - 175. - - - As indicated above, if you want the expansion of a template - to contain a literal \, - {, or - { character, you must escape - it. - - - - Filtering keywords to change their results - - Some of the results of template expansion are not - immediately easy to use. Mercurial lets you specify an optional - chain of filters to modify the result of - expanding a keyword. You have already seen a common filter, - isodate, in - action above, to make a date readable. - - Below is a list of the most commonly used filters that - Mercurial supports. While some filters can be applied to any - text, others can only be used in specific circumstances. The - name of each filter is followed first by an indication of where - it can be used, then a description of its effect. - - - addbreaks: Any text. Add - an XHTML <br/> tag - before the end of every line except the last. For example, - foo\nbar becomes - foo<br/>\nbar. - - age: date keyword. Render - the age of the date, relative to the current time. Yields a - string like 10 - minutes. - - basename: Any text, but - most useful for the files keyword and its - relatives. Treat the text as a path, and return the - basename. For example, - foo/bar/baz becomes - baz. - - date: date keyword. Render a - date in a similar format to the Unix date command, but with - timezone included. Yields a string like Mon - Sep 04 15:13:13 2006 -0700. - - domain: Any text, - but most useful for the author keyword. Finds - the first string that looks like an email address, and - extract just the domain component. For example, - Bryan O'Sullivan - <bos@serpentine.com> becomes - serpentine.com. - - email: Any text, - but most useful for the author keyword. Extract - the first string that looks like an email address. For - example, Bryan O'Sullivan - <bos@serpentine.com> becomes - bos@serpentine.com. - - escape: Any text. - Replace the special XML/XHTML characters - &, - < and - > with XML - entities. 
- - fill68: Any text. Wrap - the text to fit in 68 columns. This is useful before you - pass text through the tabindent filter, and - still want it to fit in an 80-column fixed-font - window. - - fill76: Any text. Wrap - the text to fit in 76 columns. - - firstline: Any text. - Yield the first line of text, without any trailing - newlines. - - hgdate: date keyword. Render - the date as a pair of readable numbers. Yields a string - like 1157407993 - 25200. - - isodate: date keyword. Render - the date as a text string in ISO 8601 format. Yields a - string like 2006-09-04 15:13:13 - -0700. - - obfuscate: Any text, but - most useful for the author keyword. Yield - the input text rendered as a sequence of XML entities. This - helps to defeat some particularly stupid screen-scraping - email harvesting spambots. - - person: Any text, - but most useful for the author keyword. Yield - the text before an email address. For example, - Bryan O'Sullivan - <bos@serpentine.com> becomes - Bryan O'Sullivan. - - rfc822date: - date keyword. - Render a date using the same format used in email headers. - Yields a string like Mon, 04 Sep 2006 - 15:13:13 -0700. - - short: Changeset - hash. Yield the short form of a changeset hash, i.e. a - 12-character hexadecimal string. - - shortdate: date keyword. Render - the year, month, and day of the date. Yields a string like - 2006-09-04. - - strip: - Any text. Strip all leading and trailing whitespace from - the string. - - tabindent: Any text. - Yield the text, with every line except the first starting - with a tab character. - - urlescape: Any text. - Escape all characters that are considered - special by URL parsers. For example, - foo bar becomes - foo%20bar. - - user: Any text, - but most useful for the author keyword. Return - the user portion of an email address. For - example, Bryan O'Sullivan - <bos@serpentine.com> becomes - bos. 
- - -&interaction.template.simple.manyfilters; - - - If you try to apply a filter to a piece of data that it - cannot process, Mercurial will fail and print a Python - exception. For example, trying to run the output of the - desc keyword into - the isodate - filter is not a good idea. - - - - Combining filters - - It is easy to combine filters to yield output in the form - you would like. The following chain of filters tidies up a - description, then makes sure that it fits cleanly into 68 - columns, then indents it by a further 8 characters (at least - on Unix-like systems, where a tab is conventionally 8 - characters wide). - - &interaction.template.simple.combine; - - Note the use of \t (a - tab character) in the template to force the first line to be - indented; this is necessary since tabindent indents all - lines except the first. - - Keep in mind that the order of filters in a chain is - significant. The first filter is applied to the result of the - keyword; the second to the result of the first filter; and so - on. For example, using fill68|tabindent - gives very different results from - tabindent|fill68. - - - - - - From templates to styles - - A command line template provides a quick and simple way to - format some output. Templates can become verbose, though, and - it's useful to be able to give a template a name. A style file - is a template with a name, stored in a file. - - More than that, using a style file unlocks the power of - Mercurial's templating engine in ways that are not possible - using the command line option. - - - The simplest of style files - - Our simple style file contains just one line: - - &interaction.template.simple.rev; - - This tells Mercurial, if you're printing a - changeset, use the text on the right as the - template. - - - - Style file syntax - - The syntax rules for a style file are simple. - - - The file is processed one line at a - time. - - Leading and trailing white space are - ignored. - - Empty lines are skipped. 
- - If a line starts with either of the characters - # or - ;, the entire line is - treated as a comment, and skipped as if empty. - - A line starts with a keyword. This must start - with an alphabetic character or underscore, and can - subsequently contain any alphanumeric character or - underscore. (In regexp notation, a keyword must match - [A-Za-z_][A-Za-z0-9_]*.) - - The next element must be an - = character, which can - be preceded or followed by an arbitrary amount of white - space. - - If the rest of the line starts and ends with - matching quote characters (either single or double quote), - it is treated as a template body. - - If the rest of the line does - not start with a quote character, it is - treated as the name of a file; the contents of this file - will be read and used as a template body. - - - - - - Style files by example - - To illustrate how to write a style file, we will construct a - few by example. Rather than provide a complete style file and - walk through it, we'll mirror the usual process of developing a - style file by starting with something very simple, and walking - through a series of successively more complete examples. - - - Identifying mistakes in style files - - If Mercurial encounters a problem in a style file you are - working on, it prints a terse error message that, once you - figure out what it means, is actually quite useful. - -&interaction.template.svnstyle.syntax.input; - - Notice that broken.style attempts to - define a changeset keyword, but forgets to - give any content for it. When instructed to use this style - file, Mercurial promptly complains. - - &interaction.template.svnstyle.syntax.error; - - This error message looks intimidating, but it is not too - hard to follow. - - - The first component is simply Mercurial's way - of saying I am giving up. - ___abort___: broken.style:1: parse error - - Next comes the name of the style file that - contains the error. 
- abort: ___broken.style___:1: parse error - - Following the file name is the line number - where the error was encountered. - abort: broken.style:___1___: parse error - - Finally, a description of what went - wrong. - abort: broken.style:1: ___parse error___ - - The description of the problem is not always - clear (as in this case), but even when it is cryptic, it - is almost always trivial to visually inspect the offending - line in the style file and see what is wrong. - - - - - Uniquely identifying a repository - - If you would like to be able to identify a Mercurial - repository fairly uniquely using a short string - as an identifier, you can use the first revision in the - repository. - - &interaction.template.svnstyle.id; - - This is not guaranteed to be unique, but it is - nevertheless useful in many cases. - - It will not work in a completely empty - repository, because such a repository does not have a - revision zero. - - Neither will it work in the (extremely rare) - case where a repository is a merge of two or more formerly - independent repositories, and you still have those - repositories around. - - Here are some uses to which you could put this - identifier: - - As a key into a table for a database that - manages repositories on a server. - - As half of a {repository - ID, revision ID} tuple. - Save this information away when you run an automated build - or other activity, so that you can replay - the build later if necessary. - - - - - Mimicking Subversion's output - - Let's try to emulate the default output format used by - another revision control tool, Subversion. - - &interaction.template.svnstyle.short; - - Since Subversion's output style is fairly simple, it is - easy to copy-and-paste a hunk of its output into a file, and - replace the text produced above by Subversion with the - template values we'd like to see expanded. 
- - &interaction.template.svnstyle.template; - - There are a few small ways in which this template deviates - from the output produced by Subversion. - - Subversion prints a readable - date (the Wed, 27 Sep 2006 in the - example output above) in parentheses. Mercurial's - templating engine does not provide a way to display a date - in this format without also printing the time and time - zone. - - We emulate Subversion's printing of - separator lines full of - - characters by ending - the template with such a line. We use the templating - engine's header - keyword to print a separator line as the first line of - output (see below), thus achieving similar output to - Subversion. - - Subversion's output includes a count in the - header of the number of lines in the commit message. We - cannot replicate this in Mercurial; the templating engine - does not currently provide a filter that counts the number - of lines the template generates. - - It took me no more than a minute or two of work to replace - literal text from an example of Subversion's output with some - keywords and filters to give the template above. The style - file simply refers to the template. - - &interaction.template.svnstyle.style; - - We could have included the text of the template file - directly in the style file by enclosing it in quotes and - replacing the newlines with - \n sequences, but it would - have made the style file too difficult to read. Readability - is a good guide when you're trying to decide whether some text - belongs in a style file, or in a template file that the style - file points to. If the style file will look too big or - cluttered if you insert a literal piece of text, drop it into - a template instead. 
- - - - - - diff -r 5276f40fca1c -r 896ab6eaf1c6 en/ch11-mq.xml --- a/en/ch11-mq.xml Thu Jul 09 13:32:44 2009 +0900 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,1321 +0,0 @@ - - - - - Managing change with Mercurial Queues - - - The patch management problem - - Here is a common scenario: you need to install a software - package from source, but you find a bug that you must fix in the - source before you can start using the package. You make your - changes, forget about the package for a while, and a few months - later you need to upgrade to a newer version of the package. If - the newer version of the package still has the bug, you must - extract your fix from the older source tree and apply it against - the newer version. This is a tedious task, and it's easy to - make mistakes. - - This is a simple case of the patch management - problem. You have an upstream source tree that - you can't change; you need to make some local changes on top of - the upstream tree; and you'd like to be able to keep those - changes separate, so that you can apply them to newer versions - of the upstream source. - - The patch management problem arises in many situations. - Probably the most visible is that a user of an open source - software project will contribute a bug fix or new feature to the - project's maintainers in the form of a patch. - - Distributors of operating systems that include open source - software often need to make changes to the packages they - distribute so that they will build properly in their - environments. - - When you have few changes to maintain, it is easy to manage - a single patch using the standard diff and - patch programs (see for a discussion of these - tools). 
Once the number of changes grows, it starts to make - sense to maintain patches as discrete chunks of - work, so that for example a single patch will contain - only one bug fix (the patch might modify several files, but it's - doing only one thing), and you may have a number - of such patches for different bugs you need fixed and local - changes you require. In this situation, if you submit a bug fix - patch to the upstream maintainers of a package and they include - your fix in a subsequent release, you can simply drop that - single patch when you're updating to the newer release. - - Maintaining a single patch against an upstream tree is a - little tedious and error-prone, but not difficult. However, the - complexity of the problem grows rapidly as the number of patches - you have to maintain increases. With more than a tiny number of - patches in hand, understanding which ones you have applied and - maintaining them moves from messy to overwhelming. - - Fortunately, Mercurial includes a powerful extension, - Mercurial Queues (or simply MQ), that massively - simplifies the patch management problem. - - - - The prehistory of Mercurial Queues - - During the late 1990s, several Linux kernel developers - started to maintain patch series that modified - the behavior of the Linux kernel. Some of these series were - focused on stability, some on feature coverage, and others were - more speculative. - - The sizes of these patch series grew rapidly. In 2002, - Andrew Morton published some shell scripts he had been using to - automate the task of managing his patch queues. Andrew was - successfully using these scripts to manage hundreds (sometimes - thousands) of patches on top of the Linux kernel. - - - A patchwork quilt - - In early 2003, Andreas Gruenbacher and Martin Quinson - borrowed the approach of Andrew's scripts and published a tool - called patchwork quilt - web:quilt, or simply quilt - (see gruenbacher:2005 for a paper - describing it). 
Because quilt substantially automated patch - management, it rapidly gained a large following among open - source software developers. - - Quilt manages a stack of patches on - top of a directory tree. To begin, you tell quilt to manage a - directory tree, and tell it which files you want to manage; it - stores away the names and contents of those files. To fix a - bug, you create a new patch (using a single command), edit the - files you need to fix, then refresh the - patch. - - The refresh step causes quilt to scan the directory tree; - it updates the patch with all of the changes you have made. - You can create another patch on top of the first, which will - track the changes required to modify the tree from tree - with one patch applied to tree with two - patches applied. - - You can change which patches are - applied to the tree. If you pop a patch, the - changes made by that patch will vanish from the directory - tree. Quilt remembers which patches you have popped, though, - so you can push a popped patch again, and the - directory tree will be restored to contain the modifications - in the patch. Most importantly, you can run the - refresh command at any time, and the topmost - applied patch will be updated. This means that you can, at - any time, change both which patches are applied and what - modifications those patches make. - - Quilt knows nothing about revision control tools, so it - works equally well on top of an unpacked tarball or a - Subversion working copy. - - - - From patchwork quilt to Mercurial Queues - - In mid-2005, Chris Mason took the features of quilt and - wrote an extension that he called Mercurial Queues, which - added quilt-like behavior to Mercurial. - - The key difference between quilt and MQ is that quilt - knows nothing about revision control systems, while MQ is - integrated into Mercurial. Each patch - that you push is represented as a Mercurial changeset. Pop a - patch, and the changeset goes away. 
- - Because quilt does not care about revision control tools, - it is still a tremendously useful piece of software to know - about for situations where you cannot use Mercurial and - MQ. - - - - - The huge advantage of MQ - - I cannot overstate the value that MQ offers through the - unification of patches and revision control. - - A major reason that patches have persisted in the free - software and open source world&emdash;in spite of the - availability of increasingly capable revision control tools over - the years&emdash;is the agility they - offer. - - Traditional revision control tools make a permanent, - irreversible record of everything that you do. While this has - great value, it's also somewhat stifling. If you want to - perform a wild-eyed experiment, you have to be careful in how - you go about it, or you risk leaving unneeded&emdash;or worse, - misleading or destabilising&emdash;traces of your missteps and - errors in the permanent revision record. - - By contrast, MQ's marriage of distributed revision control - with patches makes it much easier to isolate your work. Your - patches live on top of normal revision history, and you can make - them disappear or reappear at will. If you don't like a patch, - you can drop it. If a patch isn't quite as you want it to be, - simply fix it&emdash;as many times as you need to, until you - have refined it into the form you desire. - - As an example, the integration of patches with revision - control makes understanding patches and debugging their - effects&emdash;and their interplay with the code they're based - on&emdash;enormously easier. Since every - applied patch has an associated changeset, you can give hg log a file name to see which - changesets and patches affected the file. You can use the - hg bisect command to - binary-search through all changesets and applied patches to see - where a bug got introduced or fixed. 
You can use the hg annotate command to see which - changeset or patch modified a particular line of a source file. - And so on. - - - - Understanding patches - - Because MQ doesn't hide its patch-oriented nature, it is - helpful to understand what patches are, and a little about the - tools that work with them. - - The traditional Unix diff command - compares two files, and prints a list of differences between - them. The patch command understands these - differences as modifications to make to a - file. Take a look below for a simple example of these commands - in action. - -&interaction.mq.dodiff.diff; - - The type of file that diff generates (and - patch takes as input) is called a - patch or a diff; there is no - difference between a patch and a diff. (We'll use the term - patch, since it's more commonly used.) - - A patch file can start with arbitrary text; the - patch command ignores this text, but MQ uses - it as the commit message when creating changesets. To find the - beginning of the patch content, patch - searches for the first line that starts with the string - diff -. - - MQ works with unified diffs - (patch can accept several other diff formats, - but MQ doesn't). A unified diff contains two kinds of header. - The file header describes the file being - modified; it contains the name of the file to modify. When - patch sees a new file header, it looks for a - file with that name to start modifying. - - After the file header comes a series of - hunks. Each hunk starts with a header; - this identifies the range of line numbers within the file that - the hunk should modify. Following the header, a hunk starts and - ends with a few (usually three) lines of text from the - unmodified file; these are called the - context for the hunk. If there's only a - small amount of context between successive hunks, - diff doesn't print a new hunk header; it just - runs the hunks together, with a few lines of context between - modifications. 
- - Each line of context begins with a space character. Within - the hunk, a line that begins with - - means remove this - line, while a line that begins with - + means insert this - line. For example, a line that is modified is - represented by one deletion and one insertion. - - We will return to some of the more subtle aspects of patches - later (in ), but you - should have - enough information now to use MQ. - - - - Getting started with Mercurial Queues - - Because MQ is implemented as an extension, you must - explicitly enable before you can use it. (You don't need to - download anything; MQ ships with the standard Mercurial - distribution.) To enable MQ, edit your ~/.hgrc file, and add the lines - below. - - [extensions] -hgext.mq = - - Once the extension is enabled, it will make a number of new - commands available. To verify that the extension is working, - you can use hg help to see if - the qinit command is now - available. - -&interaction.mq.qinit-help.help; - - You can use MQ with any Mercurial - repository, and its commands only operate within that - repository. To get started, simply prepare the repository using - the qinit command. - -&interaction.mq.tutorial.qinit; - - This command creates an empty directory called .hg/patches, where - MQ will keep its metadata. As with many Mercurial commands, the - qinit command prints nothing - if it succeeds. - - - Creating a new patch - - To begin work on a new patch, use the qnew command. This command takes - one argument, the name of the patch to create. - - MQ will use this as the name of an actual file in the - .hg/patches directory, as you - can see below. - -&interaction.mq.tutorial.qnew; - - Also newly present in the .hg/patches directory are two - other files, series and - status. The series file lists all of the - patches that MQ knows about for this repository, with one - patch per line. 
Mercurial uses the status file for internal - book-keeping; it tracks all of the patches that MQ has - applied in this repository. - - - You may sometimes want to edit the series file by hand; for - example, to change the sequence in which some patches are - applied. However, manually editing the status file is almost always a - bad idea, as it's easy to corrupt MQ's idea of what is - happening. - - - Once you have created your new patch, you can edit files - in the working directory as you usually would. All of the - normal Mercurial commands, such as hg - diff and hg - annotate, work exactly as they did before. - - - - Refreshing a patch - - When you reach a point where you want to save your work, - use the qrefresh command - to update the patch you are working on. - -&interaction.mq.tutorial.qrefresh; - - This command folds the changes you have made in the - working directory into your patch, and updates its - corresponding changeset to contain those changes. - - You can run qrefresh - as often as you like, so it's a good way to - checkpoint your work. Refresh your patch at an - opportune time; try an experiment; and if the experiment - doesn't work out, hg revert - your modifications back to the last time you refreshed. - -&interaction.mq.tutorial.qrefresh2; - - - - Stacking and tracking patches - - Once you have finished working on a patch, or need to work - on another, you can use the qnew command again to create a - new patch. Mercurial will apply this patch on top of your - existing patch. - -&interaction.mq.tutorial.qnew2; - Notice that the patch contains the changes in our prior - patch as part of its context (you can see this more clearly in - the output of hg - annotate). - - So far, with the exception of qnew and qrefresh, we've been careful to - only use regular Mercurial commands. However, MQ provides - many commands that are easier to use when you are thinking - about patches, as illustrated below. 
- -&interaction.mq.tutorial.qseries; - - - The qseries command lists every - patch that MQ knows about in this repository, from oldest - to newest (most recently - created). - - The qapplied command lists every - patch that MQ has applied in this - repository, again from oldest to newest (most recently - applied). - - - - - Manipulating the patch stack - - The previous discussion implied that there must be a - difference between known and - applied patches, and there is. MQ can manage a - patch without it being applied in the repository. - - An applied patch has a corresponding - changeset in the repository, and the effects of the patch and - changeset are visible in the working directory. You can undo - the application of a patch using the qpop command. MQ still - knows about, or manages, a popped patch, - but the patch no longer has a corresponding changeset in the - repository, and the working directory does not contain the - changes made by the patch. The figure below illustrates - the difference between applied and tracked patches. - -
- Applied and unapplied patches in the MQ patch - stack - - - XXX add text - -
- - You can reapply an unapplied, or popped, patch using the - qpush command. This - creates a new changeset to correspond to the patch, and the - patch's changes once again become present in the working - directory. See below for examples of qpop and qpush in action. -&interaction.mq.tutorial.qpop; - - Notice that once we have popped a patch or two patches, - the output of qseries - remains the same, while that of qapplied has changed. - - -
- - Pushing and popping many patches - - While qpush and - qpop each operate on a - single patch at a time by default, you can push and pop many - patches in one go. The option to - qpush causes it to push - all unapplied patches, while the option to qpop causes it to pop all applied - patches. (For some more ways to push and pop many patches, - see below.) - -&interaction.mq.tutorial.qpush-a; - - - - Safety checks, and overriding them - - Several MQ commands check the working directory before - they do anything, and fail if they find any modifications. - They do this to ensure that you won't lose any changes that - you have made, but not yet incorporated into a patch. The - example below illustrates this; the qnew command will not create a - new patch if there are outstanding changes, caused in this - case by the hg add of - file3. - -&interaction.mq.tutorial.add; - - Commands that check the working directory all take an - I know what I'm doing option, which is always - named . The exact meaning of - depends on the command. For example, - hg qnew - will incorporate any outstanding changes into the new patch it - creates, but hg qpop - will revert modifications to any files affected by the patch - that it is popping. Be sure to read the documentation for a - command's option before you use it! - - - - Working on several patches at once - - The qrefresh command - always refreshes the topmost applied - patch. This means that you can suspend work on one patch (by - refreshing it), pop or push to make a different patch the top, - and work on that patch for a - while. - - Here's an example that illustrates how you can use this - ability. Let's say you're developing a new feature as two - patches. The first is a change to the core of your software, - and the second&emdash;layered on top of the - first&emdash;changes the user interface to use the code you - just added to the core. 
If you notice a bug in the core while - you're working on the UI patch, it's easy to fix the core. - Simply qrefresh the UI - patch to save your in-progress changes, and qpop down to the core patch. Fix - the core bug, qrefresh the - core patch, and qpush back - to the UI patch to continue where you left off. - - -
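The round trip described above can be sketched as two small shell functions. This is only an illustration under stated assumptions: the patch names `core.patch` and `ui.patch` and the function names are hypothetical, and the functions are merely defined here, so nothing runs until you call them inside a repository that has MQ enabled.

```shell
# Hypothetical stack: core.patch is applied below ui.patch.

switch_to_core() {
    hg qrefresh          # save in-progress UI work into the topmost patch
    hg qpop core.patch   # pop until core.patch is the topmost applied patch
}

back_to_ui() {
    hg qrefresh          # fold the core fix into core.patch
    hg qpush ui.patch    # push until ui.patch is on top again
}
```

You would call `switch_to_core`, edit the core files, and then call `back_to_ui` to resume the UI work.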
- - More about patches - - MQ uses the GNU patch command to apply - patches, so it's helpful to know a few more detailed aspects of - how patch works, and about patches - themselves. - - - The strip count - - If you look at the file headers in a patch, you will - notice that the pathnames usually have an extra component on - the front that isn't present in the actual path name. This is - a holdover from the way that people used to generate patches - (people still do this, but it's somewhat rare with modern - revision control tools). - - Alice would unpack a tarball, edit her files, then decide - that she wanted to create a patch. So she'd rename her - working directory, unpack the tarball again (hence the need - for the rename), and use the and options to - diff to recursively generate a patch - between the unmodified directory and the modified one. The - result would be that the name of the unmodified directory - would be at the front of the left-hand path in every file - header, and the name of the modified directory would be at the - front of the right-hand path. - - Since someone receiving a patch from the Alices of the net - would be unlikely to have unmodified and modified directories - with exactly the same names, the patch - command has a option - that indicates the number of leading path name components to - strip when trying to apply a patch. This number is called the - strip count. - - An option of -p1 means - use a strip count of one. If - patch sees a file name - foo/bar/baz in a file header, it will - strip foo and try to patch a file named - bar/baz. (Strictly speaking, the strip - count refers to the number of path - separators (and the components that go with them - ) to strip. A strip count of one will turn - foo/bar into bar, - but /foo/bar (notice the extra leading - slash) into foo/bar.) - - The standard strip count for patches is - one; almost all patches contain one leading path name - component that needs to be stripped. 
Mercurial's hg diff command generates path names - in this form, and the hg - import command and MQ expect patches to have a - strip count of one. - - If you receive a patch from someone that you want to add - to your patch queue, and the patch needs a strip count other - than one, you cannot just qimport the patch, because - qimport does not yet have - a -p option (see issue - 311). Your best bet is to qnew a patch of your own, then - use patch -pN to apply their patch, - followed by hg addremove to - pick up any files added or removed by the patch, followed by - hg qrefresh. This - complexity may become unnecessary; see issue - 311 for details. - - - - Strategies for applying a patch - - When patch applies a hunk, it tries a - handful of successively less accurate strategies to try to - make the hunk apply. This falling-back technique often makes - it possible to take a patch that was generated against an old - version of a file, and apply it against a newer version of - that file. - - First, patch tries an exact match, - where the line numbers, the context, and the text to be - modified must apply exactly. If it cannot make an exact - match, it tries to find an exact match for the context, - without honouring the line numbering information. If this - succeeds, it prints a line of output saying that the hunk was - applied, but at some offset from the - original line number. - - If a context-only match fails, patch - removes the first and last lines of the context, and tries a - reduced context-only match. If the hunk - with reduced context succeeds, it prints a message saying that - it applied the hunk with a fuzz factor - (the number after the fuzz factor indicates how many lines of - context patch had to trim before the patch - applied). - - When neither of these techniques works, - patch prints a message saying that the hunk - in question was rejected. 
It saves rejected hunks (also - simply called rejects) to a file with the same - name, and an added .rej - extension. It also saves an unmodified copy of the file with - a .orig extension; the - copy of the file without any extensions will contain any - changes made by hunks that did apply - cleanly. If you have a patch that modifies - foo with six hunks, and one of them fails - to apply, you will have: an unmodified - foo.orig, a foo.rej - containing one hunk, and foo, containing - the changes made by the five successful hunks. - - - - Some quirks of patch representation - - There are a few useful things to know about how - patch works with files. - - This should already be obvious, but - patch cannot handle binary - files. - - Neither does it care about the executable bit; - it creates new files as readable, but not - executable. - - patch treats the removal of - a file as a diff between the file to be removed and the - empty file. So your idea of I deleted this - file looks like every line of this file - was deleted in a patch. - - It treats the addition of a file as a diff - between the empty file and the file to be added. So in a - patch, your idea of I added this file looks - like every line of this file was - added. - - It treats a renamed file as the removal of the - old name, and the addition of the new name. This means - that renamed files have a big footprint in patches. (Note - also that Mercurial does not currently try to infer when - files have been renamed or copied in a patch.) - - patch cannot represent - empty files, so you cannot use a patch to represent the - notion I added this empty file to the - tree. - - - - Beware the fuzz - - While applying a hunk at an offset, or with a fuzz factor, - will often be completely successful, these inexact techniques - naturally leave open the possibility of corrupting the patched - file. The most common cases typically involve applying a - patch twice, or at an incorrect location in the file. 
If - patch or qpush ever mentions an offset or - fuzz factor, you should make sure that the modified files are - correct afterwards. - - It's often a good idea to refresh a patch that has applied - with an offset or fuzz factor; refreshing the patch generates - new context information that will make it apply cleanly. I - say often, not always, because - sometimes refreshing a patch will make it fail to apply - against a different revision of the underlying files. In some - cases, such as when you're maintaining a patch that must sit - on top of multiple versions of a source tree, it's acceptable - to have a patch apply with some fuzz, provided you've verified - the results of the patching process in such cases. - - - - Handling rejection - - If qpush fails to - apply a patch, it will print an error message and exit. If it - has left .rej files - behind, it is usually best to fix up the rejected hunks before - you push more patches or do any further work. - - If your patch used to apply cleanly, - and no longer does because you've changed the underlying code - that your patches are based on, Mercurial Queues can help; see - for details. - - Unfortunately, there aren't any great techniques for - dealing with rejected hunks. Most often, you'll need to view - the .rej file and edit the - target file, applying the rejected hunks by hand. - - If you're feeling adventurous, Neil Brown, a Linux kernel - hacker, wrote a tool called wiggle - web:wiggle, which is more vigorous than - patch in its attempts to make a patch - apply. - - Another Linux kernel hacker, Chris Mason (the author of - Mercurial Queues), wrote a similar tool called - mpatch web:mpatch, - which takes a simple approach to automating the application of - hunks rejected by patch. The - mpatch command can help with four common - reasons that a hunk may be rejected: - - - The context in the middle of a hunk has - changed. - - A hunk is missing some context at the - beginning or end. 
- - A large hunk might apply better&emdash;either - entirely or in part&emdash;if it was broken up into - smaller hunks. - - A hunk removes lines with slightly different - content than those currently present in the file. - - - If you use wiggle or - mpatch, you should be doubly careful to - check your results when you're done. In fact, - mpatch enforces this method of - double-checking the tool's output, by automatically dropping - you into a merge program when it has done its job, so that you - can verify its work and finish off any remaining - merges. - - - - - Getting the best performance out of MQ - - MQ is very efficient at handling a large number of patches. - I ran some performance experiments in mid-2006 for a talk that I - gave at the 2006 EuroPython conference - web:europython. I used as my data set the - Linux 2.6.17-mm1 patch series, which consists of 1,738 patches. - I applied these on top of a Linux kernel repository containing - all 27,472 revisions between Linux 2.6.12-rc2 and Linux - 2.6.17. - - On my old, slow laptop, I was able to hg qpush all - 1,738 patches in 3.5 minutes, and hg qpop - - them all in 30 seconds. (On a newer laptop, the time to push - all patches dropped to two minutes.) I could qrefresh one of the biggest patches - (which made 22,779 lines of changes to 287 files) in 6.6 - seconds. - - Clearly, MQ is well suited to working in large trees, but - there are a few tricks you can use to get the best performance - out of it. - - First of all, try to batch operations - together. Every time you run qpush or qpop, these commands scan the - working directory once to make sure you haven't made some - changes and then forgotten to run qrefresh. On a small tree, the - time that this scan takes is unnoticeable. However, on a - medium-sized tree (containing tens of thousands of files), it - can take a second or more. - - The qpush and qpop commands allow you to push and - pop multiple patches at a time.
You can identify the - destination patch that you want to end up at. - When you qpush with a - destination specified, it will push patches until that patch is - at the top of the applied stack. When you qpop to a destination, MQ will pop - patches until the destination patch is at the top. - - You can identify a destination patch either by the name - of the patch or by number. If you use numeric addressing, - patches are counted from zero; this means that the first patch - is zero, the second is one, and so on. - - - - Updating your patches when the underlying code - changes - - It's common to have a stack of patches on top of an - underlying repository that you don't modify directly. If you're - working on changes to third-party code, or on a feature that is - taking longer to develop than the rate of change of the code - beneath, you will often need to sync up with the underlying - code, and fix up any hunks in your patches that no longer apply. - This is called rebasing your patch - series. - - The simplest way to do this is to hg - qpop your patches, then hg pull changes into the underlying - repository, and finally hg qpush your - patches again. MQ will stop pushing any time it runs across a - patch that fails to apply because of conflicts, allowing you to fix - them, qrefresh the - affected patch, and continue pushing until you have fixed your - entire stack. - - This approach is easy to use and works well if you don't - expect changes to the underlying code to affect how well your - patches apply. If your patch stack touches code that is modified - frequently or invasively in the underlying repository, however, - fixing up rejected hunks by hand quickly becomes - tiresome. - - It's possible to partially automate the rebasing process. - If your patches apply cleanly against some revision of the - underlying repo, MQ can use this information to help you to - resolve conflicts between your patches and a different - revision.
- - The process is a little involved. - - To begin, hg qpush - -a all of your patches on top of the revision - where you know that they apply cleanly. - - Save a backup copy of your patch directory using - hg qsave . - This prints the name of the directory that it has saved the - patches in. It will save the patches to a directory called - .hg/patches.N, where - N is a small integer. It also commits a - save changeset on top of your applied - patches; this is for internal book-keeping, and records the - states of the series and - status files. - - Use hg pull to - bring new changes into the underlying repository. (Don't - run hg pull -u; see below - for why.) - - Update to the new tip revision, using hg update to override - the patches you have pushed. - - Merge all patches using hg qpush -m - -a. The option to - qpush tells MQ to - perform a three-way merge if the patch fails to - apply. - - - During the hg qpush , - each patch in the series - file is applied normally. If a patch applies with fuzz or - rejects, MQ looks at the queue you qsaved, and performs a three-way - merge with the corresponding changeset. This merge uses - Mercurial's normal merge machinery, so it may pop up a GUI merge - tool to help you to resolve problems. - - When you finish resolving the effects of a patch, MQ - refreshes your patch based on the result of the merge. - - At the end of this process, your repository will have one - extra head from the old patch queue, and a copy of the old patch - queue will be in .hg/patches.N. You can remove the - extra head using hg qpop -a -n - patches.N or hg - strip. You can delete .hg/patches.N once you are sure - that you no longer need it as a backup. - - - - Identifying patches - - MQ commands that work with patches let you refer to a patch - either by using its name or by a number. By name is obvious - enough; pass the name foo.patch to qpush, for example, and it will - push patches until foo.patch is - applied. 
- - As a shortcut, you can refer to a patch using both a name - and a numeric offset; foo.patch-2 means - two patches before foo.patch, - while bar.patch+4 means four patches - after bar.patch. - - Referring to a patch by index isn't much different. The - first patch printed in the output of qseries is patch zero (yes, it's - one of those start-at-zero counting systems); the second is - patch one; and so on. - - MQ also makes it easy to work with patches when you are - using normal Mercurial commands. Every command that accepts a - changeset ID will also accept the name of an applied patch. MQ - augments the tags normally in the repository with an eponymous - one for each applied patch. In addition, the special tags - qbase and - qtip identify - the bottom-most and topmost applied patches, - respectively. - - These additions to Mercurial's normal tagging capabilities - make dealing with patches even more of a breeze. - - Want to patchbomb a mailing list with your - latest series of changes? - hg email qbase:qtip - (Don't know what patchbombing is? See - .) - - Need to see all of the patches since - foo.patch that have touched files in a - subdirectory of your tree? - hg log -r foo.patch:qtip subdir - - - - Because MQ makes the names of patches available to the rest - of Mercurial through its normal internal tag machinery, you - don't need to type in the entire name of a patch when you want - to identify it by name. - - Another nice consequence of representing patch names as tags - is that when you run the hg log - command, it will display a patch's name as a tag, simply as part - of its normal output. This makes it easy to visually - distinguish applied patches from underlying - normal revisions. The following example shows a - few normal Mercurial commands in use with applied - patches. 
- -&interaction.mq.id.output; - - - - Useful things to know about - - There are a number of aspects of MQ usage that don't fit - tidily into sections of their own, but that are good to know. - Here they are, in one place. - - - Normally, when you qpop a patch and qpush it again, the changeset - that represents the patch after the pop/push will have a - different identity than the changeset - that represented the patch beforehand. See for - information as to why this is. - - It's not a good idea to hg merge changes from another - branch with a patch changeset, at least if you want to - maintain the patchiness of that changeset and - changesets below it on the patch stack. If you try to do - this, it will appear to succeed, but MQ will become - confused. - - - - - Managing patches in a repository - - Because MQ's .hg/patches directory resides - outside a Mercurial repository's working directory, the - underlying Mercurial repository knows nothing - about the management or presence of patches. - - This presents the interesting possibility of managing the - contents of the patch directory as a Mercurial repository in its - own right. This can be a useful way to work. For example, you - can work on a patch for a while, qrefresh it, then hg commit the current state of the - patch. This lets you roll back to that version - of the patch later on. - - You can then share different versions of the same patch - stack among multiple underlying repositories. I use this when I - am developing a Linux kernel feature. I have a pristine copy of - my kernel sources for each of several CPU architectures, and a - cloned repository under each that contains the patches I am - working on. When I want to test a change on a different - architecture, I push my current patches to the patch repository - associated with that kernel tree, pop and push all of my - patches, and build and test that kernel.
- - Managing patches in a repository makes it possible for - multiple developers to work on the same patch series without - colliding with each other, all on top of an underlying source - base that they may or may not control. - - - MQ support for patch repositories - - MQ helps you to work with the .hg/patches directory as a - repository; when you prepare a repository for working with - patches using qinit, you - can pass the option to create the .hg/patches directory as a - Mercurial repository. - - - If you forget to use the option, you - can simply go into the .hg/patches directory at any - time and run hg init. - Don't forget to add an entry for the status file to the .hgignore file, though - - (hg qinit - does this for you automatically); you - really don't want to manage the - status file. - - - As a convenience, if MQ notices that the .hg/patches directory is a - repository, it will automatically hg - add every patch that you create and import. - - MQ provides a shortcut command, qcommit, that runs hg commit in the .hg/patches - directory. This saves some bothersome typing. - - Finally, as a convenience to manage the patch directory, - you can define the alias mq on Unix - systems. For example, on Linux systems using the - bash shell, you can include the following - snippet in your ~/.bashrc. - - alias mq='hg -R $(hg root)/.hg/patches' - - You can then issue commands of the form mq - pull from the main repository. - - - - A few things to watch out for - - MQ's support for working with a repository full of patches - is limited in a few small respects. - - MQ cannot automatically detect changes that you make to - the patch directory. If you hg - pull, manually edit, or hg - update changes to patches or the series file, you will have to - hg qpop and - then hg qpush in - the underlying repository to see those changes show up there. - If you forget to do this, you can confuse MQ's idea of which - patches are applied.
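The resynchronisation step above can be sketched as a shell function. This is an assumption-laden illustration rather than an MQ feature: the function name `resync_patches` is made up, it assumes that `.hg/patches` is itself a Mercurial repository with a default pull source, and it pops and pushes the whole stack with `-a`. The function is only defined here; call it from inside the underlying repository.

```shell
# Hypothetical helper: bring in patch changes, then let MQ catch up.
resync_patches() {
    # Pull and update inside the patch repository, not the main one.
    hg -R "$(hg root)/.hg/patches" pull -u || return 1

    # Pop and re-push everything so the applied changesets match
    # the updated patch files and series file.
    hg qpop -a && hg qpush -a
}
```

If a patch no longer applies after the update, the final push stops at that patch, and you can fix it up and refresh it as usual.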
- - - - - Third party tools for working with patches - - Once you've been working with patches for a while, you'll - find yourself hungry for tools that will help you to understand - and manipulate the patches you're dealing with. - - The diffstat command - web:diffstat generates a histogram of the - modifications made to each file in a patch. It provides a good - way to get a sense of a patch&emdash;which files - it affects, and how much change it introduces to each file and - as a whole. (I find that it's a good idea to use - diffstat's option as a matter of - course, as otherwise it will try to do clever things with - prefixes of file names that inevitably confuse at least - me.) - -&interaction.mq.tools.tools; - - The patchutils package - web:patchutils is invaluable. It provides a - set of small utilities that follow the Unix - philosophy; each does one useful thing with a patch. - The patchutils command I use - most is filterdiff, which extracts subsets - from a patch file. For example, given a patch that modifies - hundreds of files across dozens of directories, a single - invocation of filterdiff can generate a - smaller patch that only touches files whose names match a - particular glob pattern. See for another - example. - - - - Good ways to work with patches - - Whether you are working on a patch series to submit to a - free software or open source project, or a series that you - intend to treat as a sequence of regular changesets when you're - done, you can use some simple techniques to keep your work well - organized. - - Give your patches descriptive names. A good name for a - patch might be rework-device-alloc.patch, - because it will immediately give you a hint what the purpose of - the patch is. Long names shouldn't be a problem; you won't be - typing the names often, but you will be - running commands like qapplied and qtop over and over. 
Good naming - becomes especially important when you have a number of patches - to work with, or if you are juggling a number of different tasks - and your patches only get a fraction of your attention. - - Be aware of what patch you're working on. Use the qtop command and skim over the text - of your patches frequently&emdash;for example, using hg tip&emdash;to be sure - of where you stand. I have several times worked on and qrefreshed a patch other than the - one I intended, and it's often tricky to migrate changes into - the right patch after making them in the wrong one. - - For this reason, it is very much worth investing a little - time to learn how to use some of the third-party tools I - described in , - particularly - diffstat and filterdiff. - The former will give you a quick idea of what changes your patch - is making, while the latter makes it easy to splice hunks - selectively out of one patch and into another. - - - - MQ cookbook - - - Manage <quote>trivial</quote> patches - - Because the overhead of dropping files into a new - Mercurial repository is so low, it makes a lot of sense to - manage patches this way even if you simply want to make a few - changes to a source tarball that you downloaded. - - Begin by downloading and unpacking the source tarball, and - turning it into a Mercurial repository. - - &interaction.mq.tarball.download; - - Continue by creating a patch stack and making your - changes. - - &interaction.mq.tarball.qinit; - - Let's say a few weeks or months pass, and your package - author releases a new version. First, bring their changes - into the repository. - - &interaction.mq.tarball.newsource; - - The pipeline starting with hg - locate above deletes all files in the working - directory, so that hg - commit's option can - actually tell which files have really been removed in the - newer version of the source. - - Finally, you can apply your patches on top of the new - tree.
- - &interaction.mq.tarball.repush; - - - - Combining entire patches - - MQ provides a command, qfold, that lets you combine - entire patches. This folds the patches you - name, in the order you name them, into the topmost applied - patch, and concatenates their descriptions onto the end of its - description. The patches that you fold must be unapplied - before you fold them. - - The order in which you fold patches matters. If your - topmost applied patch is foo, and you - qfold - bar and quux into it, - you will end up with a patch that has the same effect as if - you applied first foo, then - bar, followed by - quux. - - - - Merging part of one patch into another - - Merging part of one patch into - another is more difficult than combining entire - patches. - - If you want to move changes to entire files, you can use - filterdiff's and options to choose the - modifications to snip out of one patch, concatenating its - output onto the end of the patch you want to merge into. You - usually won't need to modify the patch you've merged the - changes from. Instead, MQ will report some rejected hunks - when you qpush it (from - the hunks you moved into the other patch), and you can simply - qrefresh the patch to drop - the duplicate hunks. - - If you have a patch that has multiple hunks modifying a - file, and you only want to move a few of those hunks, the job - becomes messier, but you can still partly automate it. Use - lsdiff -nvv to print some metadata about - the patch. - - &interaction.mq.tools.lsdiff; - - This command prints three different kinds of - number: - - (in the first column) a file - number to identify each file modified in the - patch; - - (on the next line, indented) the line number - within a modified file where a hunk starts; and - - (on the same line) a hunk - number to identify that hunk.
- - - You'll have to use some visual inspection, and reading of - the patch, to identify the file and hunk numbers you'll want, - but you can then pass them to - filterdiff's and options, to - select exactly the file and hunk you want to extract. - - Once you have this hunk, you can concatenate it onto the - end of your destination patch and continue with the remainder - of . - - - - - Differences between quilt and MQ - - If you are already familiar with quilt, MQ provides a - similar command set. There are a few differences in the way - that it works. - - You will already have noticed that most quilt commands have - MQ counterparts that simply begin with a - q. The exceptions are quilt's - add and remove commands, - the counterparts for which are the normal Mercurial hg add and hg - remove commands. There is no MQ equivalent of the - quilt edit command. - - 
- - diff -r 5276f40fca1c -r 896ab6eaf1c6 en/ch11-template.xml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/en/ch11-template.xml Fri Jul 10 02:32:17 2009 +0900 @@ -0,0 +1,685 @@ + + + + + Customizing the output of Mercurial + + Mercurial provides a powerful mechanism to let you control how + it displays information. The mechanism is based on templates. + You can use templates to generate specific output for a single + command, or to customize the entire appearance of the built-in web + interface. + + + Using precanned output styles + + Packaged with Mercurial are some output styles that you can + use immediately. A style is simply a precanned template that + someone wrote and installed somewhere that Mercurial can + find. + + Before we take a look at Mercurial's bundled styles, let's + review its normal output. + + &interaction.template.simple.normal; + + This is somewhat informative, but it takes up a lot of + space&emdash;five lines of output per changeset. The + compact style reduces this to three lines, + presented in a sparse manner. + + &interaction.template.simple.compact; + + The changelog style hints at the + expressive power of Mercurial's templating engine. This style + attempts to follow the GNU Project's changelog + guidelinesweb:changelog. + + &interaction.template.simple.changelog; + + You will not be shocked to learn that Mercurial's default + output style is named default. + + + Setting a default style + + You can modify the output style that Mercurial will use + for every command by editing your ~/.hgrc file, naming the style + you would prefer to use. + + [ui] +style = compact + + If you write a style of your own, you can use it by either + providing the path to your style file, or copying your style + file into a location where Mercurial can find it (typically + the templates subdirectory of your + Mercurial install directory). 
+ + + + + Commands that support styles and templates + + All of Mercurial's + log-like commands let you use + styles and templates: hg + incoming, hg log, + hg outgoing, and hg tip. + + As I write this manual, these are so far the only commands + that support styles and templates. Since these are the most + important commands that need customizable output, there has been + little pressure from the Mercurial user community to add style + and template support to other commands. + + + + The basics of templating + + At its simplest, a Mercurial template is a piece of text. + Some of the text never changes, while other parts are + expanded, or replaced with new text, when + necessary. + + Before we continue, let's look again at a simple example of + Mercurial's normal output. + + &interaction.template.simple.normal; + + Now, let's run the same command, but using a template to + change its output. + + &interaction.template.simple.simplest; + + The example above illustrates the simplest possible + template; it's just a piece of static text, printed once for + each changeset. The --template option to the hg log command tells Mercurial to use + the given text as the template when printing each + changeset. + + Notice that the template string above ends with the text + \n. This is an + escape sequence, telling Mercurial to print + a newline at the end of each template item. If you omit this + newline, Mercurial will run each piece of output together. See + the section on escape sequences, later in this chapter, for more details. + + A template that prints a fixed string of text all the time + isn't very useful; let's try something a bit more + complex. + + &interaction.template.simple.simplesub; + + As you can see, the string + {desc} in the template has + been replaced in the output with the description of each + changeset. Every time Mercurial finds text enclosed in curly + braces ({ and + }), it will try to replace the + braces and text with the expansion of whatever is inside.
To + print a literal curly brace, you must escape it, as described in + . + + + + Common template keywords + + You can start writing simple templates immediately using the + keywords below. + + + author: String. The + unmodified author of the changeset. + + branches: String. The + name of the branch on which the changeset was committed. + Will be empty if the branch name was + default. + + date: + Date information. The date when the changeset was + committed. This is not human-readable; + you must pass it through a filter that will render it + appropriately. See for more information + on filters. The date is expressed as a pair of numbers. The + first number is a Unix UTC timestamp (seconds since January + 1, 1970); the second is the offset of the committer's + timezone from UTC, in seconds. + + desc: + String. The text of the changeset description. + + files: List of strings. + All files modified, added, or removed by this + changeset. + + file_adds: List of + strings. Files added by this changeset. + + file_dels: List of + strings. Files removed by this changeset. + + node: + String. The changeset identification hash, as a + 40-character hexadecimal string. + + parents: List of + strings. The parents of the changeset. + + rev: + Integer. The repository-local changeset revision + number. + + tags: + List of strings. Any tags associated with the + changeset. + + + + A few simple experiments will show us what to expect when we + use these keywords; you can see the results below. + + &interaction.template.simple.keywords; + + As we noted above, the date keyword does not produce + human-readable output, so we must treat it specially. This + involves using a filter, about which more + in . + + &interaction.template.simple.datekeyword; + + + + Escape sequences + + Mercurial's templating engine recognises the most commonly + used escape sequences in strings. 
When it sees a backslash + (\) character, it looks at the + following character and substitutes the two characters with a + single replacement, as described below. The ASCII codes listed are in octal. + + + \\: + Backslash, \, ASCII + 134. + + \n: Newline, + ASCII 12. + + \r: Carriage + return, ASCII 15. + + \t: Tab, ASCII + 11. + + \v: Vertical + tab, ASCII 13. + + \{: Open curly + brace, {, ASCII + 173. + + \}: Close curly + brace, }, ASCII + 175. + + + As indicated above, if you want the expansion of a template + to contain a literal \, + {, or + } character, you must escape + it. + + + + Filtering keywords to change their results + + Some of the results of template expansion are not + immediately easy to use. Mercurial lets you specify an optional + chain of filters to modify the result of + expanding a keyword. You have already seen a common filter, + isodate, in + action above, to make a date readable. + + Below is a list of the most commonly used filters that + Mercurial supports. While some filters can be applied to any + text, others can only be used in specific circumstances. The + name of each filter is followed first by an indication of where + it can be used, then a description of its effect. + + + addbreaks: Any text. Add + an XHTML <br/> tag + before the end of every line except the last. For example, + foo\nbar becomes + foo<br/>\nbar. + + age: date keyword. Render + the age of the date, relative to the current time. Yields a + string like 10 + minutes. + + basename: Any text, but + most useful for the files keyword and its + relatives. Treat the text as a path, and return the + basename. For example, + foo/bar/baz becomes + baz. + + date: date keyword. Render a + date in a similar format to the Unix date command, but with + timezone included. Yields a string like Mon + Sep 04 15:13:13 2006 -0700. + + domain: Any text, + but most useful for the author keyword. Find + the first string that looks like an email address, and + extract just the domain component.
For example, + Bryan O'Sullivan + <bos@serpentine.com> becomes + serpentine.com. + + email: Any text, + but most useful for the author keyword. Extract + the first string that looks like an email address. For + example, Bryan O'Sullivan + <bos@serpentine.com> becomes + bos@serpentine.com. + + escape: Any text. + Replace the special XML/XHTML characters + &, + < and + > with XML + entities. + + fill68: Any text. Wrap + the text to fit in 68 columns. This is useful before you + pass text through the tabindent filter, and + still want it to fit in an 80-column fixed-font + window. + + fill76: Any text. Wrap + the text to fit in 76 columns. + + firstline: Any text. + Yield the first line of text, without any trailing + newlines. + + hgdate: date keyword. Render + the date as a pair of readable numbers. Yields a string + like 1157407993 + 25200. + + isodate: date keyword. Render + the date as a text string in ISO 8601 format. Yields a + string like 2006-09-04 15:13:13 + -0700. + + obfuscate: Any text, but + most useful for the author keyword. Yield + the input text rendered as a sequence of XML entities. This + helps to defeat some particularly stupid screen-scraping + email harvesting spambots. + + person: Any text, + but most useful for the author keyword. Yield + the text before an email address. For example, + Bryan O'Sullivan + <bos@serpentine.com> becomes + Bryan O'Sullivan. + + rfc822date: + date keyword. + Render a date using the same format used in email headers. + Yields a string like Mon, 04 Sep 2006 + 15:13:13 -0700. + + short: Changeset + hash. Yield the short form of a changeset hash, i.e. a + 12-character hexadecimal string. + + shortdate: date keyword. Render + the year, month, and day of the date. Yields a string like + 2006-09-04. + + strip: + Any text. Strip all leading and trailing whitespace from + the string. + + tabindent: Any text. + Yield the text, with every line except the first starting + with a tab character. + + urlescape: Any text. 
+ Escape all characters that are considered + special by URL parsers. For example, + foo bar becomes + foo%20bar. + + user: Any text, + but most useful for the author keyword. Return + the user portion of an email address. For + example, Bryan O'Sullivan + <bos@serpentine.com> becomes + bos. + + + + &interaction.template.simple.manyfilters; + + + If you try to apply a filter to a piece of data that it + cannot process, Mercurial will fail and print a Python + exception. For example, trying to run the output of the + desc keyword into + the isodate + filter is not a good idea. + + + + Combining filters + + It is easy to combine filters to yield output in the form + you would like. The following chain of filters tidies up a + description, then makes sure that it fits cleanly into 68 + columns, then indents it by a further 8 characters (at least + on Unix-like systems, where a tab is conventionally 8 + characters wide). + + &interaction.template.simple.combine; + + Note the use of \t (a + tab character) in the template to force the first line to be + indented; this is necessary since tabindent indents all + lines except the first. + + Keep in mind that the order of filters in a chain is + significant. The first filter is applied to the result of the + keyword; the second to the result of the first filter; and so + on. For example, using fill68|tabindent + gives very different results from + tabindent|fill68. + + + + + From templates to styles + + A command line template provides a quick and simple way to + format some output. Templates can become verbose, though, and + it's useful to be able to give a template a name. A style file + is a template with a name, stored in a file. + + More than that, using a style file unlocks the power of + Mercurial's templating engine in ways that are not possible + using the command line + --template option.
+ + + The simplest of style files + + Our simple style file contains just one line: + + &interaction.template.simple.rev; + + This tells Mercurial, if you're printing a + changeset, use the text on the right as the + template. + + + + Style file syntax + + The syntax rules for a style file are simple. + + + The file is processed one line at a + time. + + Leading and trailing white space are + ignored. + + Empty lines are skipped. + + If a line starts with either of the characters + # or + ;, the entire line is + treated as a comment, and skipped as if empty. + + A line starts with a keyword. This must start + with an alphabetic character or underscore, and can + subsequently contain any alphanumeric character or + underscore. (In regexp notation, a keyword must match + [A-Za-z_][A-Za-z0-9_]*.) + + The next element must be an + = character, which can + be preceded or followed by an arbitrary amount of white + space. + + If the rest of the line starts and ends with + matching quote characters (either single or double quote), + it is treated as a template body. + + If the rest of the line does + not start with a quote character, it is + treated as the name of a file; the contents of this file + will be read and used as a template body. + + + + + + Style files by example + + To illustrate how to write a style file, we will construct a + few by example. Rather than provide a complete style file and + walk through it, we'll mirror the usual process of developing a + style file by starting with something very simple, and walking + through a series of successively more complete examples. + + + Identifying mistakes in style files + + If Mercurial encounters a problem in a style file you are + working on, it prints a terse error message that, once you + figure out what it means, is actually quite useful. + +&interaction.template.svnstyle.syntax.input; + + Notice that broken.style attempts to + define a changeset keyword, but forgets to + give any content for it. 
When instructed to use this style + file, Mercurial promptly complains. + + &interaction.template.svnstyle.syntax.error; + + This error message looks intimidating, but it is not too + hard to follow. + + + The first component is simply Mercurial's way + of saying I am giving up. + ___abort___: broken.style:1: parse error + + Next comes the name of the style file that + contains the error. + abort: ___broken.style___:1: parse error + + Following the file name is the line number + where the error was encountered. + abort: broken.style:___1___: parse error + + Finally, a description of what went + wrong. + abort: broken.style:1: ___parse error___ + + The description of the problem is not always + clear (as in this case), but even when it is cryptic, it + is almost always trivial to visually inspect the offending + line in the style file and see what is wrong. + + + + + + Uniquely identifying a repository + + If you would like to be able to identify a Mercurial + repository fairly uniquely using a short string + as an identifier, you can use the first revision in the + repository. + + &interaction.template.svnstyle.id; + + This is likely to be unique, and so it is + useful in many cases. There are a few caveats. + + It will not work in a completely empty + repository, because such a repository does not have a + revision zero. + + Neither will it work in the (extremely rare) + case where a repository is a merge of two or more formerly + independent repositories, and you still have those + repositories around. + + Here are some uses to which you could put this + identifier: + + As a key into a table for a database that + manages repositories on a server. + + As half of a {repository + ID, revision ID} tuple. + Save this information away when you run an automated build + or other activity, so that you can replay + the build later if necessary. 
+ + + + + + Listing files on multiple lines + + Suppose we want to list the files changed by a changeset, + one per line, with a little indentation before each file + name. + + &interaction.ch10-multiline.go; + + + + Mimicking Subversion's output + + Let's try to emulate the default output format used by + another revision control tool, Subversion. + + &interaction.template.svnstyle.short; + + Since Subversion's output style is fairly simple, it is + easy to copy-and-paste a hunk of its output into a file, and + replace the text produced above by Subversion with the + template values we'd like to see expanded. + + &interaction.template.svnstyle.template; + + There are a few small ways in which this template deviates + from the output produced by Subversion. + + Subversion prints a readable + date (the Wed, 27 Sep 2006 in the + example output above) in parentheses. Mercurial's + templating engine does not provide a way to display a date + in this format without also printing the time and time + zone. + + We emulate Subversion's printing of + separator lines full of + - characters by ending + the template with such a line. We use the templating + engine's header + keyword to print a separator line as the first line of + output (see below), thus achieving similar output to + Subversion. + + Subversion's output includes a count in the + header of the number of lines in the commit message. We + cannot replicate this in Mercurial; the templating engine + does not currently provide a filter that counts the number + of lines the template generates. + + It took me no more than a minute or two of work to replace + literal text from an example of Subversion's output with some + keywords and filters to give the template above. The style + file simply refers to the template. 
+ + &interaction.template.svnstyle.style; + + We could have included the text of the template file + directly in the style file by enclosing it in quotes and + replacing the newlines with + \n sequences, but it would + have made the style file too difficult to read. Readability + is a good guide when you're trying to decide whether some text + belongs in a style file, or in a template file that the style + file points to. If the style file will look too big or + cluttered if you insert a literal piece of text, drop it into + a template instead. + + + + + diff -r 5276f40fca1c -r 896ab6eaf1c6 en/ch12-mq-collab.xml --- a/en/ch12-mq-collab.xml Thu Jul 09 13:32:44 2009 +0900 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,518 +0,0 @@ - - - - - Advanced uses of Mercurial Queues - - While it's easy to pick up straightforward uses of Mercurial - Queues, use of a little discipline and some of MQ's less - frequently used capabilities makes it possible to work in - complicated development environments. - - In this chapter, I will use as an example a technique I have - used to manage the development of an Infiniband device driver for - the Linux kernel. The driver in question is large (at least as - drivers go), with 25,000 lines of code spread across 35 source - files. It is maintained by a small team of developers. - - While much of the material in this chapter is specific to - Linux, the same principles apply to any code base for which you're - not the primary owner, and upon which you need to do a lot of - development. - - - The problem of many targets - - The Linux kernel changes rapidly, and has never been - internally stable; developers frequently make drastic changes - between releases. This means that a version of the driver that - works well with a particular released version of the kernel will - not even compile correctly against, - typically, any other version. - - To maintain a driver, we have to keep a number of distinct - versions of Linux in mind. 
- - One target is the main Linux kernel development - tree. Maintenance of the code is in this case partly shared - by other developers in the kernel community, who make - drive-by modifications to the driver as they - develop and refine kernel subsystems. - - We also maintain a number of - backports to older versions of the Linux - kernel, to support the needs of customers who are running - older Linux distributions that do not incorporate our - drivers. (To backport a piece of code - is to modify it to work in an older version of its target - environment than the version it was developed for.) - - Finally, we make software releases on a schedule - that is necessarily not aligned with those used by Linux - distributors and kernel developers, so that we can deliver - new features to customers without forcing them to upgrade - their entire kernels or distributions. - - - - Tempting approaches that don't work well - - There are two standard ways to maintain a - piece of software that has to target many different - environments. - - The first is to maintain a number of branches, each - intended for a single target. The trouble with this approach - is that you must maintain iron discipline in the flow of - changes between repositories. A new feature or bug fix must - start life in a pristine repository, then - percolate out to every backport repository. Backport changes - are more limited in the branches they should propagate to; a - backport change that is applied to a branch where it doesn't - belong will probably stop the driver from compiling. - - The second is to maintain a single source tree filled with - conditional statements that turn chunks of code on or off - depending on the intended target. Because these - ifdefs are not allowed in the Linux kernel - tree, a manual or automatic process must be followed to strip - them out and yield a clean tree. 
A code base maintained in - this fashion rapidly becomes a rat's nest of conditional - blocks that are difficult to understand and maintain. - - Neither of these approaches is well suited to a situation - where you don't own the canonical copy of a - source tree. In the case of a Linux driver that is - distributed with the standard kernel, Linus's tree contains - the copy of the code that will be treated by the world as - canonical. The upstream version of my driver - can be modified by people I don't know, without me even - finding out about it until after the changes show up in - Linus's tree. - - These approaches have the added weakness of making it - difficult to generate well-formed patches to submit - upstream. - - In principle, Mercurial Queues seems like a good candidate - to manage a development scenario such as the above. While - this is indeed the case, MQ contains a few added features that - make the job more pleasant. - - - - - Conditionally applying patches with guards - - Perhaps the best way to maintain sanity with so many targets - is to be able to choose specific patches to apply for a given - situation. MQ provides a feature called guards - (which originates with quilt's guards - command) that does just this. To start off, let's create a - simple repository for experimenting in. - - &interaction.mq.guards.init; - - This gives us a tiny repository that contains two patches - that don't have any dependencies on each other, because they - touch different files. - - The idea behind conditional application is that you can - tag a patch with a guard, - which is simply a text string of your choosing, then tell MQ to - select specific guards to use when applying patches. MQ will - then either apply, or skip over, a guarded patch, depending on - the guards that you have selected. 
- - A patch can have an arbitrary number of guards; each one is - positive (apply this patch if this - guard is selected) or negative - (skip this patch if this guard is selected). A - patch with no guards is always applied. - - - - Controlling the guards on a patch - - The qguard command lets - you determine which guards should apply to a patch, or display - the guards that are already in effect. Without any arguments, it - displays the guards on the current topmost patch. - - &interaction.mq.guards.qguard; - - To set a positive guard on a patch, prefix the name of the - guard with a +. - - &interaction.mq.guards.qguard.pos; - - To set a negative guard - on a patch, prefix the name of the guard with a - -. - - &interaction.mq.guards.qguard.neg; - - - The qguard command - sets the guards on a patch; it doesn't - modify them. What this means is that if - you run hg qguard +a +b on a - patch, then hg qguard +c on - the same patch, the only guard that will - be set on it afterwards is +c. - - - Mercurial stores guards in the series file; the form in which they - are stored is easy both to understand and to edit by hand. (In - other words, you don't have to use the qguard command if you don't want - to; it's okay to simply edit the series file.) - - &interaction.mq.guards.series; - - - - Selecting the guards to use - - The qselect command - determines which guards are active at a given time. The effect - of this is to determine which patches MQ will apply the next - time you run qpush. It has - no other effect; in particular, it doesn't do anything to - patches that are already applied. - - With no arguments, the qselect command lists the guards - currently in effect, one per line of output. Each argument is - treated as the name of a guard to apply. - - &interaction.mq.guards.qselect.foo; - - In case you're interested, the currently selected guards are - stored in the guards file. 
- - &interaction.mq.guards.qselect.cat; - - We can see the effect the selected guards have when we run - qpush. - - &interaction.mq.guards.qselect.qpush; - - A guard cannot start with a - + or - - character. The name of a - guard must not contain white space, but most other characters - are acceptable. If you try to use a guard with an invalid name, - MQ will complain: - - &interaction.mq.guards.qselect.error; - - Changing the selected guards changes the patches that are - applied. - - &interaction.mq.guards.qselect.quux; - - You can see in the example below that negative guards take - precedence over positive guards. - - &interaction.mq.guards.qselect.foobar; - - - - MQ's rules for applying patches - - The rules that MQ uses when deciding whether to apply a - patch are as follows. - - A patch that has no guards is always - applied. - - If the patch has any negative guard that matches - any currently selected guard, the patch is skipped. - - If the patch has any positive guard that matches - any currently selected guard, the patch is applied. - - If the patch has positive or negative guards, - but none matches any currently selected guard, the patch is - skipped. - - - - - Trimming the work environment - - In working on the device driver I mentioned earlier, I don't - apply the patches to a normal Linux kernel tree. Instead, I use - a repository that contains only a snapshot of the source files - and headers that are relevant to Infiniband development. This - repository is 1% the size of a kernel repository, so it's easier - to work with. - - I then choose a base version on top of which - the patches are applied. This is a snapshot of the Linux kernel - tree as of a revision of my choosing. When I take the snapshot, - I record the changeset ID from the kernel repository in the - commit message. 
Since the snapshot preserves the - shape and content of the relevant parts of the - kernel tree, I can apply my patches on top of either my tiny - repository or a normal kernel tree. - - Normally, the base tree atop which the patches apply should - be a snapshot of a very recent upstream tree. This best - facilitates the development of patches that can easily be - submitted upstream with few or no modifications. - - - - Dividing up the <filename role="special">series</filename> - file - - I categorise the patches in the series file into a number of logical - groups. Each section of like patches begins with a block of - comments that describes the purpose of the patches that - follow. - - The sequence of patch groups that I maintain follows. The - ordering of these groups is important; I'll describe why after I - introduce the groups. - - The accepted group. Patches that - the development team has submitted to the maintainer of the - Infiniband subsystem, and which he has accepted, but which - are not present in the snapshot that the tiny repository is - based on. These are read only patches, - present only to transform the tree into a similar state as - it is in the upstream maintainer's repository. - - The rework group. Patches that I - have submitted, but that the upstream maintainer has - requested modifications to before he will accept - them. - - The pending group. Patches that - I have not yet submitted to the upstream maintainer, but - which we have finished working on. These will be read - only for a while. If the upstream maintainer - accepts them upon submission, I'll move them to the end of - the accepted group. If he requests that I - modify any, I'll move them to the beginning of the - rework group. - - The in progress group. Patches - that are actively being developed, and should not be - submitted anywhere yet. - - The backport group. Patches that - adapt the source tree to older versions of the kernel - tree. - - The do not ship group. 
Patches - that for some reason should never be submitted upstream. - For example, one such patch might change embedded driver - identification strings to make it easier to distinguish, in - the field, between an out-of-tree version of the driver and - a version shipped by a distribution vendor. - - - Now to return to the reasons for ordering groups of patches - in this way. We would like the lowest patches in the stack to - be as stable as possible, so that we will not need to rework - higher patches due to changes in context. Putting patches that - will never be changed first in the series file serves this - purpose. - - We would also like the patches that we know we'll need to - modify to be applied on top of a source tree that resembles the - upstream tree as closely as possible. This is why we keep - accepted patches around for a while. - - The backport and do not ship - patches float at the end of the series file. The backport patches - must be applied on top of all other patches, and the do - not ship patches might as well stay out of harm's - way. - - - - Maintaining the patch series - - In my work, I use a number of guards to control which - patches are to be applied. - - - Accepted patches are guarded with - accepted. I enable this guard most of - the time. When I'm applying the patches on top of a tree - where the patches are already present, I can turn this patch - off, and the patches that follow it will apply - cleanly. - - Patches that are finished, but - not yet submitted, have no guards. If I'm applying the - patch stack to a copy of the upstream tree, I don't need to - enable any guards in order to get a reasonably safe source - tree. - - Those patches that need reworking before being - resubmitted are guarded with - rework. - - For those patches that are still under - development, I use devel. - - A backport patch may have several guards, one - for each version of the kernel to which it applies. 
For - example, a patch that backports a piece of code to 2.6.9 - will have a 2.6.9 guard. - - This variety of guards gives me considerable flexibility in - determining what kind of source tree I want to end up with. For - most situations, the selection of appropriate guards is - automated during the build process, but I can manually tune the - guards to use for less common circumstances. - - - The art of writing backport patches - - Using MQ, writing a backport patch is a simple process. - All such a patch has to do is modify a piece of code that uses - a kernel feature not present in the older version of the - kernel, so that the driver continues to work correctly under - that older version. - - A useful goal when writing a good backport patch is to - make your code look as if it was written for the older version - of the kernel you're targeting. The less obtrusive the patch, - the easier it will be to understand and maintain. If you're - writing a collection of backport patches to avoid the - rat's nest effect of lots of - #ifdefs (hunks of source code that are only - used conditionally) in your code, don't introduce - version-dependent #ifdefs into the patches. - Instead, write several patches, each of which makes - unconditional changes, and control their application using - guards. - - There are two reasons to divide backport patches into a - distinct group, away from the regular patches - whose effects they modify. The first is that intermingling the - two makes it more difficult to use a tool like the patchbomb extension to automate the - process of submitting the patches to an upstream maintainer. - The second is that a backport patch could perturb the context - in which a subsequent regular patch is applied, making it - impossible to apply the regular patch cleanly - without the earlier backport patch - already being applied. 
- - - - - Useful tips for developing with MQ - - - Organising patches in directories - - If you're working on a substantial project with MQ, it's - not difficult to accumulate a large number of patches. For - example, I have one patch repository that contains over 250 - patches. - - If you can group these patches into separate logical - categories, you can if you like store them in different - directories; MQ has no problems with patch names that contain - path separators. - - - - Viewing the history of a patch - - If you're developing a set of patches over a long time, - it's a good idea to maintain them in a repository, as - discussed in . If you do - so, you'll quickly - discover that using the hg - diff command to look at the history of changes to - a patch is unworkable. This is in part because you're looking - at the second derivative of the real code (a diff of a diff), - but also because MQ adds noise to the process by modifying - time stamps and directory names when it updates a - patch. - - However, you can use the extdiff extension, which is bundled - with Mercurial, to turn a diff of two versions of a patch into - something readable. To do this, you will need a third-party - package called patchutils - web:patchutils. This provides a command - named interdiff, which shows the - differences between two diffs as a diff. Used on two versions - of the same diff, it generates a diff that represents the diff - from the first to the second version. - - You can enable the extdiff extension in the usual way, - by adding a line to the extensions section of your - ~/.hgrc. - [extensions] -extdiff = - The interdiff command expects to be - passed the names of two files, but the extdiff extension passes the program - it runs a pair of directories, each of which can contain an - arbitrary number of files. We thus need a small program that - will run interdiff on each pair of files in - these two directories. 
This program is available as hg-interdiff in the examples directory of the - source code repository that accompanies this book. - - With the hg-interdiff - program in your shell's search path, you can run it as - follows, from inside an MQ patch directory: - hg extdiff -p hg-interdiff -r A:B my-change.patch - Since you'll probably want to use this long-winded command - a lot, you can get hgext to - make it available as a normal Mercurial command, again by - editing your ~/.hgrc. - [extdiff] -cmd.interdiff = hg-interdiff - This directs hgext to - make an interdiff command available, so you - can now shorten the previous invocation of extdiff to something a - little more wieldy. - hg interdiff -r A:B my-change.patch - - - The interdiff command works well - only if the underlying files against which versions of a - patch are generated remain the same. If you create a patch, - modify the underlying files, and then regenerate the patch, - interdiff may not produce useful - output. - - - The extdiff extension is - useful for more than merely improving the presentation of MQ - patches. To read more about it, go to . - - - - - - diff -r 5276f40fca1c -r 896ab6eaf1c6 en/ch12-mq.xml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/en/ch12-mq.xml Fri Jul 10 02:32:17 2009 +0900 @@ -0,0 +1,1368 @@ + + + + + Managing change with Mercurial Queues + + + The patch management problem + + Here is a common scenario: you need to install a software + package from source, but you find a bug that you must fix in the + source before you can start using the package. You make your + changes, forget about the package for a while, and a few months + later you need to upgrade to a newer version of the package. If + the newer version of the package still has the bug, you must + extract your fix from the older source tree and apply it against + the newer version. This is a tedious task, and it's easy to + make mistakes. + + This is a simple case of the patch management + problem. 
You have an upstream source tree that + you can't change; you need to make some local changes on top of + the upstream tree; and you'd like to be able to keep those + changes separate, so that you can apply them to newer versions + of the upstream source. + + The patch management problem arises in many situations. + Probably the most visible is that a user of an open source + software project will contribute a bug fix or new feature to the + project's maintainers in the form of a patch. + + Distributors of operating systems that include open source + software often need to make changes to the packages they + distribute so that they will build properly in their + environments. + + When you have few changes to maintain, it is easy to manage + a single patch using the standard diff and + patch programs (see for a discussion of these + tools). Once the number of changes grows, it starts to make + sense to maintain patches as discrete chunks of + work, so that for example a single patch will contain + only one bug fix (the patch might modify several files, but it's + doing only one thing), and you may have a number + of such patches for different bugs you need fixed and local + changes you require. In this situation, if you submit a bug fix + patch to the upstream maintainers of a package and they include + your fix in a subsequent release, you can simply drop that + single patch when you're updating to the newer release. + + Maintaining a single patch against an upstream tree is a + little tedious and error-prone, but not difficult. However, the + complexity of the problem grows rapidly as the number of patches + you have to maintain increases. With more than a tiny number of + patches in hand, understanding which ones you have applied and + maintaining them moves from messy to overwhelming. + + Fortunately, Mercurial includes a powerful extension, + Mercurial Queues (or simply MQ), that massively + simplifies the patch management problem. 
+ + + + The prehistory of Mercurial Queues + + During the late 1990s, several Linux kernel developers + started to maintain patch series that modified + the behavior of the Linux kernel. Some of these series were + focused on stability, some on feature coverage, and others were + more speculative. + + The sizes of these patch series grew rapidly. In 2002, + Andrew Morton published some shell scripts he had been using to + automate the task of managing his patch queues. Andrew was + successfully using these scripts to manage hundreds (sometimes + thousands) of patches on top of the Linux kernel. + + + A patchwork quilt + + In early 2003, Andreas Gruenbacher and Martin Quinson + borrowed the approach of Andrew's scripts and published a tool + called patchwork quilt + web:quilt, or simply quilt + (see gruenbacher:2005 for a paper + describing it). Because quilt substantially automated patch + management, it rapidly gained a large following among open + source software developers. + + Quilt manages a stack of patches on + top of a directory tree. To begin, you tell quilt to manage a + directory tree, and tell it which files you want to manage; it + stores away the names and contents of those files. To fix a + bug, you create a new patch (using a single command), edit the + files you need to fix, then refresh the + patch. + + The refresh step causes quilt to scan the directory tree; + it updates the patch with all of the changes you have made. + You can create another patch on top of the first, which will + track the changes required to modify the tree from tree + with one patch applied to tree with two + patches applied. + + You can change which patches are + applied to the tree. If you pop a patch, the + changes made by that patch will vanish from the directory + tree. Quilt remembers which patches you have popped, though, + so you can push a popped patch again, and the + directory tree will be restored to contain the modifications + in the patch. 
Most importantly, you can run the + refresh command at any time, and the topmost + applied patch will be updated. This means that you can, at + any time, change both which patches are applied and what + modifications those patches make. + + Quilt knows nothing about revision control tools, so it + works equally well on top of an unpacked tarball or a + Subversion working copy. + + + + From patchwork quilt to Mercurial Queues + + In mid-2005, Chris Mason took the features of quilt and + wrote an extension that he called Mercurial Queues, which + added quilt-like behavior to Mercurial. + + The key difference between quilt and MQ is that quilt + knows nothing about revision control systems, while MQ is + integrated into Mercurial. Each patch + that you push is represented as a Mercurial changeset. Pop a + patch, and the changeset goes away. + + Because quilt does not care about revision control tools, + it is still a tremendously useful piece of software to know + about for situations where you cannot use Mercurial and + MQ. + + + + + The huge advantage of MQ + + I cannot overstate the value that MQ offers through the + unification of patches and revision control. + + A major reason that patches have persisted in the free + software and open source world&emdash;in spite of the + availability of increasingly capable revision control tools over + the years&emdash;is the agility they + offer. + + Traditional revision control tools make a permanent, + irreversible record of everything that you do. While this has + great value, it's also somewhat stifling. If you want to + perform a wild-eyed experiment, you have to be careful in how + you go about it, or you risk leaving unneeded&emdash;or worse, + misleading or destabilising&emdash;traces of your missteps and + errors in the permanent revision record. + + By contrast, MQ's marriage of distributed revision control + with patches makes it much easier to isolate your work. 
Your + patches live on top of normal revision history, and you can make + them disappear or reappear at will. If you don't like a patch, + you can drop it. If a patch isn't quite as you want it to be, + simply fix it&emdash;as many times as you need to, until you + have refined it into the form you desire. + + As an example, the integration of patches with revision + control makes understanding patches and debugging their + effects&emdash;and their interplay with the code they're based + on&emdash;enormously easier. Since every + applied patch has an associated changeset, you can give hg log a file name to see which + changesets and patches affected the file. You can use the + hg bisect command to + binary-search through all changesets and applied patches to see + where a bug got introduced or fixed. You can use the hg annotate command to see which + changeset or patch modified a particular line of a source file. + And so on. + + + + Understanding patches + + Because MQ doesn't hide its patch-oriented nature, it is + helpful to understand what patches are, and a little about the + tools that work with them. + + The traditional Unix diff command + compares two files, and prints a list of differences between + them. The patch command understands these + differences as modifications to make to a + file. Take a look below for a simple example of these commands + in action. + + &interaction.mq.dodiff.diff; + + The type of file that diff generates (and + patch takes as input) is called a + patch or a diff; there is no + difference between a patch and a diff. (We'll use the term + patch, since it's more commonly used.) + + A patch file can start with arbitrary text; the + patch command ignores this text, but MQ uses + it as the commit message when creating changesets. To find the + beginning of the patch content, patch + searches for the first line that starts with the string + diff -. 
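The rule just described (leading text first, patch content from the first line that starts with diff -) can be sketched in a few lines of Python. This is only an illustration of that rule as stated here, not the code that patch or MQ actually runs; the function name split_patch and the sample patch text are made up for the example.

```python
def split_patch(text):
    """Split patch text into (leading message, diff content).

    Follows only the rule described in the text: the content starts at
    the first line that begins with the string "diff -".
    """
    lines = text.splitlines(keepends=True)
    for i, line in enumerate(lines):
        if line.startswith("diff -"):
            return "".join(lines[:i]), "".join(lines[i:])
    # No "diff -" line found: treat everything as leading text.
    return text, ""


# Hypothetical patch file contents, invented for this sketch.
example = (
    "Fix an off-by-one error in the parser\n"
    "\n"
    "diff -r 000000000000 -r 111111111111 parser.c\n"
    "--- a/parser.c\n"
    "+++ b/parser.c\n"
    "@@ -1,1 +1,1 @@\n"
    "-old line\n"
    "+new line\n"
)
message, diff = split_patch(example)
```

Here message holds the text that MQ would use as the commit message, and diff holds the part that patch would act on.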
+ + MQ works with unified diffs + (patch can accept several other diff formats, + but MQ doesn't). A unified diff contains two kinds of header. + The file header describes the file being + modified; it contains the name of the file to modify. When + patch sees a new file header, it looks for a + file with that name to start modifying. + + After the file header comes a series of + hunks. Each hunk starts with a header; + this identifies the range of line numbers within the file that + the hunk should modify. Following the header, a hunk starts and + ends with a few (usually three) lines of text from the + unmodified file; these are called the + context for the hunk. If there's only a + small amount of context between successive hunks, + diff doesn't print a new hunk header; it just + runs the hunks together, with a few lines of context between + modifications. + + Each line of context begins with a space character. Within + the hunk, a line that begins with + - means remove this + line, while a line that begins with + + means insert this + line. For example, a line that is modified is + represented by one deletion and one insertion. + + We will return to some of the more subtle aspects of patches + later (in ), but you + should have + enough information now to use MQ. + + + + Getting started with Mercurial Queues + + Because MQ is implemented as an extension, you must + explicitly enable it before you can use it. (You don't need to + download anything; MQ ships with the standard Mercurial + distribution.) To enable MQ, edit your ~/.hgrc file, and add the lines + below. + + [extensions] +hgext.mq = + + Once the extension is enabled, it will make a number of new + commands available. To verify that the extension is working, + you can use hg help to see if + the qinit command is now + available. + + &interaction.mq.qinit-help.help; + + You can use MQ with any Mercurial + repository, and its commands only operate within that + repository. 
To get started, simply prepare the repository using + the qinit command. + + &interaction.mq.tutorial.qinit; + + This command creates an empty directory called .hg/patches, where + MQ will keep its metadata. As with many Mercurial commands, the + qinit command prints nothing + if it succeeds. + + + Creating a new patch + + To begin work on a new patch, use the qnew command. This command takes + one argument, the name of the patch to create. + + MQ will use this as the name of an actual file in the + .hg/patches directory, as you + can see below. + + &interaction.mq.tutorial.qnew; + + Also newly present in the .hg/patches directory are two + other files, series and + status. The series file lists all of the + patches that MQ knows about for this repository, with one + patch per line. Mercurial uses the status file for internal + book-keeping; it tracks all of the patches that MQ has + applied in this repository. + + + You may sometimes want to edit the series file by hand; for + example, to change the sequence in which some patches are + applied. However, manually editing the status file is almost always a + bad idea, as it's easy to corrupt MQ's idea of what is + happening. + + + Once you have created your new patch, you can edit files + in the working directory as you usually would. All of the + normal Mercurial commands, such as hg + diff and hg + annotate, work exactly as they did before. + + + + Refreshing a patch + + When you reach a point where you want to save your work, + use the qrefresh command + to update the patch you are working on. + + &interaction.mq.tutorial.qrefresh; + + This command folds the changes you have made in the + working directory into your patch, and updates its + corresponding changeset to contain those changes. + + You can run qrefresh + as often as you like, so it's a good way to + checkpoint your work. 
Refresh your patch at an + opportune time; try an experiment; and if the experiment + doesn't work out, hg revert + your modifications back to the last time you refreshed. + + &interaction.mq.tutorial.qrefresh2; + + + + Stacking and tracking patches + + Once you have finished working on a patch, or need to work + on another, you can use the qnew command again to create a + new patch. Mercurial will apply this patch on top of your + existing patch. + + &interaction.mq.tutorial.qnew2; + + Notice that the patch contains the changes in our prior + patch as part of its context (you can see this more clearly in + the output of hg + annotate). + + So far, with the exception of qnew and qrefresh, we've been careful to + only use regular Mercurial commands. However, MQ provides + many commands that are easier to use when you are thinking + about patches, as illustrated below. + + &interaction.mq.tutorial.qseries; + + + The qseries command lists every + patch that MQ knows about in this repository, from oldest + to newest (most recently + created). + + The qapplied command lists every + patch that MQ has applied in this + repository, again from oldest to newest (most recently + applied). + + + + + Manipulating the patch stack + + The previous discussion implied that there must be a + difference between known and + applied patches, and there is. MQ can manage a + patch without it being applied in the repository. + + An applied patch has a corresponding + changeset in the repository, and the effects of the patch and + changeset are visible in the working directory. You can undo + the application of a patch using the qpop command. MQ still + knows about, or manages, a popped patch, + but the patch no longer has a corresponding changeset in the + repository, and the working directory does not contain the + changes made by the patch. illustrates + the difference between applied and tracked patches. + +
+ Applied and unapplied patches in the MQ patch + stack + + + XXX add text + +
+ + You can reapply an unapplied, or popped, patch using the + qpush command. This + creates a new changeset to correspond to the patch, and the + patch's changes once again become present in the working + directory. See below for examples of qpop and qpush in action. + + &interaction.mq.tutorial.qpop; + + Notice that once we have popped a patch or two patches, + the output of qseries + remains the same, while that of qapplied has changed. + +
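The distinction between known and applied patches can be pictured with a small toy model. The Python sketch below is an assumption-laden illustration, not how MQ stores its state: the class PatchStack and the patch names are invented, and the model treats the applied patches simply as the first few entries of the series.

```python
class PatchStack:
    """Toy model: 'series' is every known patch, applied patches are a prefix of it."""

    def __init__(self, series):
        self.series = list(series)
        self.n_applied = 0  # how many patches from the start of series are applied

    def applied(self):
        """Return the applied patches, oldest first (what qapplied would list)."""
        return self.series[:self.n_applied]

    def push(self):
        """Apply the next unapplied patch."""
        if self.n_applied == len(self.series):
            raise IndexError("all patches are already applied")
        self.n_applied += 1

    def pop(self):
        """Unapply the topmost applied patch; it stays in the series."""
        if self.n_applied == 0:
            raise IndexError("no patches applied")
        self.n_applied -= 1


stack = PatchStack(["first.patch", "second.patch"])
stack.push()
stack.push()
stack.pop()
```

After the pop, stack.series still lists both patches, while stack.applied() lists only the first, which mirrors the difference in output between qseries and qapplied.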
+ + + Pushing and popping many patches + + While qpush and + qpop each operate on a + single patch at a time by default, you can push and pop many + patches in one go. The option to + qpush causes it to push + all unapplied patches, while the option to qpop causes it to pop all applied + patches. (For some more ways to push and pop many patches, + see below.) + + &interaction.mq.tutorial.qpush-a; + + + + Safety checks, and overriding them + + Several MQ commands check the working directory before + they do anything, and fail if they find any modifications. + They do this to ensure that you won't lose any changes that + you have made, but not yet incorporated into a patch. The + example below illustrates this; the qnew command will not create a + new patch if there are outstanding changes, caused in this + case by the hg add of + file3. + + &interaction.mq.tutorial.add; + + Commands that check the working directory all take an + I know what I'm doing option, which is always + named . The exact meaning of + depends on the command. For example, + hg qnew + will incorporate any outstanding changes into the new patch it + creates, but hg qpop + will revert modifications to any files affected by the patch + that it is popping. Be sure to read the documentation for a + command's option before you use it! + + + + Working on several patches at once + + The qrefresh command + always refreshes the topmost applied + patch. This means that you can suspend work on one patch (by + refreshing it), pop or push to make a different patch the top, + and work on that patch for a + while. + + Here's an example that illustrates how you can use this + ability. Let's say you're developing a new feature as two + patches. The first is a change to the core of your software, + and the second&emdash;layered on top of the + first&emdash;changes the user interface to use the code you + just added to the core. 
If you notice a bug in the core while + you're working on the UI patch, it's easy to fix the core. + Simply qrefresh the UI + patch to save your in-progress changes, and qpop down to the core patch. Fix + the core bug, qrefresh the + core patch, and qpush back + to the UI patch to continue where you left off. + +
+ + + More about patches + + MQ uses the GNU patch command to apply + patches, so it's helpful to know a few more detailed aspects of + how patch works, and about patches + themselves. + + + The strip count + + If you look at the file headers in a patch, you will + notice that the pathnames usually have an extra component on + the front that isn't present in the actual path name. This is + a holdover from the way that people used to generate patches + (people still do this, but it's somewhat rare with modern + revision control tools). + + Alice would unpack a tarball, edit her files, then decide + that she wanted to create a patch. So she'd rename her + working directory, unpack the tarball again (hence the need + for the rename), and use the and options to + diff to recursively generate a patch + between the unmodified directory and the modified one. The + result would be that the name of the unmodified directory + would be at the front of the left-hand path in every file + header, and the name of the modified directory would be at the + front of the right-hand path. + + Since someone receiving a patch from the Alices of the net + would be unlikely to have unmodified and modified directories + with exactly the same names, the patch + command has a option + that indicates the number of leading path name components to + strip when trying to apply a patch. This number is called the + strip count. + + An option of -p1 means + use a strip count of one. If + patch sees a file name + foo/bar/baz in a file header, it will + strip foo and try to patch a file named + bar/baz. (Strictly speaking, the strip + count refers to the number of path + separators (and the components that go with them + ) to strip. A strip count of one will turn + foo/bar into bar, + but /foo/bar (notice the extra leading + slash) into foo/bar.) + + The standard strip count for patches is + one; almost all patches contain one leading path name + component that needs to be stripped. 
Mercurial's hg diff command generates path names + in this form, and the hg + import command and MQ expect patches to have a + strip count of one. + + If you receive a patch from someone that you want to add + to your patch queue, and the patch needs a strip count other + than one, you cannot just qimport the patch, because + qimport does not yet have + a -p option (see issue + 311). Your best bet is to qnew a patch of your own, then + use patch -pN to apply their patch, + followed by hg addremove to + pick up any files added or removed by the patch, followed by + hg qrefresh. This + complexity may become unnecessary; see issue + 311 for details. + + + + + Strategies for applying a patch + + When patch applies a hunk, it tries a + handful of successively less accurate strategies to try to + make the hunk apply. This falling-back technique often makes + it possible to take a patch that was generated against an old + version of a file, and apply it against a newer version of + that file. + + First, patch tries an exact match, + where the line numbers, the context, and the text to be + modified must apply exactly. If it cannot make an exact + match, it tries to find an exact match for the context, + without honouring the line numbering information. If this + succeeds, it prints a line of output saying that the hunk was + applied, but at some offset from the + original line number. + + If a context-only match fails, patch + removes the first and last lines of the context, and tries a + reduced context-only match. If the hunk + with reduced context succeeds, it prints a message saying that + it applied the hunk with a fuzz factor + (the number after the fuzz factor indicates how many lines of + context patch had to trim before the patch + applied). + + When neither of these techniques works, + patch prints a message saying that the hunk + in question was rejected. 
It saves rejected hunks (also + simply called rejects) to a file with the same + name, and an added .rej + extension. It also saves an unmodified copy of the file with + a .orig extension; the + copy of the file without any extensions will contain any + changes made by hunks that did apply + cleanly. If you have a patch that modifies + foo with six hunks, and one of them fails + to apply, you will have: an unmodified + foo.orig, a foo.rej + containing one hunk, and foo, containing + the changes made by the five successful hunks. + + + + Some quirks of patch representation + + There are a few useful things to know about how + patch works with files. + + This should already be obvious, but + patch cannot handle binary + files. + + Neither does it care about the executable bit; + it creates new files as readable, but not + executable. + + patch treats the removal of + a file as a diff between the file to be removed and the + empty file. So your idea of I deleted this + file looks like every line of this file + was deleted in a patch. + + It treats the addition of a file as a diff + between the empty file and the file to be added. So in a + patch, your idea of I added this file looks + like every line of this file was + added. + + It treats a renamed file as the removal of the + old name, and the addition of the new name. This means + that renamed files have a big footprint in patches. (Note + also that Mercurial does not currently try to infer when + files have been renamed or copied in a patch.) + + patch cannot represent + empty files, so you cannot use a patch to represent the + notion I added this empty file to the + tree. + + + + + Beware the fuzz + + While applying a hunk at an offset, or with a fuzz factor, + will often be completely successful, these inexact techniques + naturally leave open the possibility of corrupting the patched + file. The most common cases typically involve applying a + patch twice, or at an incorrect location in the file. 
If + patch or qpush ever mentions an offset or + fuzz factor, you should make sure that the modified files are + correct afterwards. + + It's often a good idea to refresh a patch that has applied + with an offset or fuzz factor; refreshing the patch generates + new context information that will make it apply cleanly. I + say often, not always, because + sometimes refreshing a patch will make it fail to apply + against a different revision of the underlying files. In some + cases, such as when you're maintaining a patch that must sit + on top of multiple versions of a source tree, it's acceptable + to have a patch apply with some fuzz, provided you've verified + the results of the patching process in such cases. + + + + Handling rejection + + If qpush fails to + apply a patch, it will print an error message and exit. If it + has left .rej files + behind, it is usually best to fix up the rejected hunks before + you push more patches or do any further work. + + If your patch used to apply cleanly, + and no longer does because you've changed the underlying code + that your patches are based on, Mercurial Queues can help; see + for details. + + Unfortunately, there aren't any great techniques for + dealing with rejected hunks. Most often, you'll need to view + the .rej file and edit the + target file, applying the rejected hunks by hand. + + A Linux kernel hacker, Chris Mason (the author + of Mercurial Queues), wrote a tool called + mpatch (http://oss.oracle.com/~mason/mpatch/), + which takes a simple approach to automating the application of + hunks rejected by patch. The + mpatch command can help with four common + reasons that a hunk may be rejected: + + + The context in the middle of a hunk has + changed. + + A hunk is missing some context at the + beginning or end. + + A large hunk might apply better&emdash;either + entirely or in part&emdash;if it was broken up into + smaller hunks. 
+ + A hunk removes lines with slightly different + content than those currently present in the file. + + + If you use mpatch, you + should be doubly careful to check your results when you're + done. In fact, mpatch enforces this method + of double-checking the tool's output, by automatically + dropping you into a merge program when it has done its job, so + that you can verify its work and finish off any remaining + merges. + + + + + More on patch management + + As you grow familiar with MQ, you will find yourself wanting + to perform other kinds of patch management operations. + + + Deleting unwanted patches + + If you want to get rid of a patch, use the hg qdelete command to delete the + patch file and remove its entry from the patch series. If you + try to delete a patch that is still applied, hg qdelete will refuse. + + &interaction.ch11-qdelete.go; + + + + Converting to and from permanent revisions + + Once you're done working on a patch and want to + turn it into a permanent changeset, use the hg qfinish command. Pass a revision + to the command to identify the patch that you want to turn into + a regular changeset; this patch must already be applied. + + &interaction.ch11-qdelete.convert; + + The hg qfinish command + accepts an or + option, which turns all applied patches into regular + changesets. + + It is also possible to turn an existing changeset into a + patch, by passing the option to hg qimport. + + &interaction.ch11-qdelete.import; + + Note that it only makes sense to convert a changeset into + a patch if you have not propagated that changeset into any + other repositories. The imported changeset's ID will change + every time you refresh the patch, which will make Mercurial + treat it as unrelated to the original changeset if you have + pushed it somewhere else. + + + + + Getting the best performance out of MQ + + MQ is very efficient at handling a large number + of patches. 
I ran some performance experiments in mid-2006 for a + talk that I gave at the 2006 EuroPython conference (on modern + hardware, you should expect better performance than you'll see + below). I used as my data set the Linux 2.6.17-mm1 patch + series, which consists of 1,738 patches. I applied these on top + of a Linux kernel repository containing all 27,472 revisions + between Linux 2.6.12-rc2 and Linux 2.6.17. + + On my old, slow laptop, I was able to hg qpush -a all + 1,738 patches in 3.5 minutes, and hg qpop -a + them all in 30 seconds. (On a newer laptop, the time to push + all patches dropped to two minutes.) I could qrefresh one of the biggest patches + (which made 22,779 lines of changes to 287 files) in 6.6 + seconds. + + Clearly, MQ is well suited to working in large trees, but + there are a few tricks you can use to get the best performance + out of it. + + First of all, try to batch operations + together. Every time you run qpush or qpop, these commands scan the + working directory once to make sure you haven't made some + changes and then forgotten to run qrefresh. On a small tree, the + time that this scan takes is unnoticeable. However, on a + medium-sized tree (containing tens of thousands of files), it + can take a second or more. + + The qpush and qpop commands allow you to push and + pop multiple patches at a time. You can identify the + destination patch that you want to end up at. + When you qpush with a + destination specified, it will push patches until that patch is + at the top of the applied stack. When you qpop to a destination, MQ will pop + patches until the destination patch is at the top. + + You can identify a destination patch using either the name + of the patch, or by number. If you use numeric addressing, + patches are counted from zero; this means that the first patch + is zero, the second is one, and so on. 
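To make the zero-based numbering concrete, here is a small Python sketch that works out how many single-patch operations one destination-style push or pop stands in for. The function plan_moves and its parameters are hypothetical names chosen for this illustration; the arithmetic simply follows the description above and is not taken from MQ's source.

```python
def plan_moves(series_len, n_applied, dest_index):
    """Return (command, count) that leaves patch number dest_index on top.

    Patches are numbered from zero, so with n_applied patches applied the
    topmost applied patch is number n_applied - 1.
    """
    if not 0 <= dest_index < series_len:
        raise IndexError("no patch with that number")
    top_index = n_applied - 1  # -1 means nothing is applied yet
    if dest_index > top_index:
        return ("qpush", dest_index - top_index)
    return ("qpop", top_index - dest_index)


# Ten patches, none applied: reaching patch 2 (the third patch) takes three pushes.
first = plan_moves(10, 0, 2)
# All ten applied: going back to patch 0 (the first patch) takes nine pops.
second = plan_moves(10, 10, 0)
```

Naming the destination lets one command do the work of several, and, going by the description above, that should mean one scan of the working directory instead of one per patch.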
+ + + + Updating your patches when the underlying code + changes + + It's common to have a stack of patches on top of an + underlying repository that you don't modify directly. If you're + working on changes to third-party code, or on a feature that is + taking longer to develop than the rate of change of the code + beneath, you will often need to sync up with the underlying + code, and fix up any hunks in your patches that no longer apply. + This is called rebasing your patch + series. + + The simplest way to do this is to hg + qpop your patches, then hg pull changes into the underlying + repository, and finally hg qpush your + patches again. MQ will stop pushing any time it runs across a + patch that fails to apply because of conflicts, allowing you to fix + your conflicts, qrefresh the + affected patch, and continue pushing until you have fixed your + entire stack. + + This approach is easy to use and works well if you don't + expect changes to the underlying code to affect how well your + patches apply. If your patch stack touches code that is modified + frequently or invasively in the underlying repository, however, + fixing up rejected hunks by hand quickly becomes + tiresome. + + It's possible to partially automate the rebasing process. + If your patches apply cleanly against some revision of the + underlying repo, MQ can use this information to help you to + resolve conflicts between your patches and a different + revision. + + The process is a little involved. + + To begin, hg qpush + -a all of your patches on top of the revision + where you know that they apply cleanly. + + Save a backup copy of your patch directory using + hg qsave . + This prints the name of the directory that it has saved the + patches in. It will save the patches to a directory called + .hg/patches.N, where + N is a small integer. 
It also commits a + save changeset on top of your applied + patches; this is for internal book-keeping, and records the + states of the series and + status files. + + Use hg pull to + bring new changes into the underlying repository. (Don't + run hg pull -u; see below + for why.) + + Update to the new tip revision, using hg update to override + the patches you have pushed. + + Merge all patches using hg qpush -m + -a. The -m option to + qpush tells MQ to + perform a three-way merge if the patch fails to + apply. + + + During the hg qpush -m, + each patch in the series + file is applied normally. If a patch applies with fuzz or + rejects, MQ looks at the queue you qsaved, and performs a three-way + merge with the corresponding changeset. This merge uses + Mercurial's normal merge machinery, so it may pop up a GUI merge + tool to help you to resolve problems. + + When you finish resolving the effects of a patch, MQ + refreshes your patch based on the result of the merge. + + At the end of this process, your repository will have one + extra head from the old patch queue, and a copy of the old patch + queue will be in .hg/patches.N. You can remove the + extra head using hg qpop -a -n + patches.N or hg + strip. You can delete .hg/patches.N once you are sure + that you no longer need it as a backup. + + + + Identifying patches + + MQ commands that work with patches let you refer to a patch + either by using its name or by a number. By name is obvious + enough; pass the name foo.patch to qpush, for example, and it will + push patches until foo.patch is + applied. + + As a shortcut, you can refer to a patch using both a name + and a numeric offset; foo.patch-2 means + two patches before foo.patch, + while bar.patch+4 means four patches + after bar.patch. + + Referring to a patch by index isn't much different. The + first patch printed in the output of qseries is patch zero (yes, it's + one of those start-at-zero counting systems); the second is + patch one; and so on. 
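The three ways of naming a patch described above (by name, by name plus a numeric offset, and by zero-based index) can be sketched as a small resolver. This Python example is an illustration only: the function resolve_patch, the sample series, and the order in which the three forms are tried are assumptions made for the sketch, and MQ's own lookup may behave differently in corner cases.

```python
import re


def resolve_patch(series, spec):
    """Return the zero-based index in series that spec refers to."""
    if spec in series:                      # an exact patch name
        index = series.index(spec)
    elif spec.isdigit():                    # a bare number, counted from zero
        index = int(spec)
    else:                                   # a name followed by +N or -N
        match = re.fullmatch(r"(.+)([+-]\d+)", spec)
        if match is None or match.group(1) not in series:
            raise ValueError(f"unknown patch: {spec}")
        index = series.index(match.group(1)) + int(match.group(2))
    if not 0 <= index < len(series):
        raise ValueError(f"patch reference out of range: {spec}")
    return index


# Hypothetical series, oldest patch first, as qseries would print it.
series = ["a.patch", "b.patch", "foo.patch", "c.patch"]
by_name = resolve_patch(series, "foo.patch")
two_before = resolve_patch(series, "foo.patch-2")
three_after = resolve_patch(series, "a.patch+3")
by_index = resolve_patch(series, "1")
```

Trying the exact name first is a deliberate choice in this sketch, so that a patch whose own name happens to end in a hyphen and digits is still found by name.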
+ + MQ also makes it easy to work with patches when you are + using normal Mercurial commands. Every command that accepts a + changeset ID will also accept the name of an applied patch. MQ + augments the tags normally in the repository with an eponymous + one for each applied patch. In addition, the special tags + qbase and + qtip identify + the bottom-most and topmost applied patches, + respectively. + + These additions to Mercurial's normal tagging capabilities + make dealing with patches even more of a breeze. + + Want to patchbomb a mailing list with your + latest series of changes? + hg email qbase:qtip + (Don't know what patchbombing is? See + .) + + Need to see all of the patches since + foo.patch that have touched files in a + subdirectory of your tree? + hg log -r foo.patch:qtip subdir + + + + Because MQ makes the names of patches available to the rest + of Mercurial through its normal internal tag machinery, you + don't need to type in the entire name of a patch when you want + to identify it by name. + + Another nice consequence of representing patch names as tags + is that when you run the hg log + command, it will display a patch's name as a tag, simply as part + of its normal output. This makes it easy to visually + distinguish applied patches from underlying + normal revisions. The following example shows a + few normal Mercurial commands in use with applied + patches. + + &interaction.mq.id.output; + + + + Useful things to know about + + There are a number of aspects of MQ usage that don't fit + tidily into sections of their own, but that are good to know. + Here they are, in one place. + + + Normally, when you qpop a patch and qpush it again, the changeset + that represents the patch after the pop/push will have a + different identity than the changeset + that represented the patch beforehand. See for + information as to why this is.
+ + It's not a good idea to hg merge changes from another + branch with a patch changeset, at least if you want to + maintain the patchiness of that changeset and + changesets below it on the patch stack. If you try to do + this, it will appear to succeed, but MQ will become + confused. + + + + + Managing patches in a repository + + Because MQ's .hg/patches directory resides + outside a Mercurial repository's working directory, the + underlying Mercurial repository knows nothing + about the management or presence of patches. + + This presents the interesting possibility of managing the + contents of the patch directory as a Mercurial repository in its + own right. This can be a useful way to work. For example, you + can work on a patch for a while, qrefresh it, then hg commit the current state of the + patch. This lets you roll back to that version + of the patch later on. + + You can then share different versions of the same patch + stack among multiple underlying repositories. I use this when I + am developing a Linux kernel feature. I have a pristine copy of + my kernel sources for each of several CPU architectures, and a + cloned repository under each that contains the patches I am + working on. When I want to test a change on a different + architecture, I push my current patches to the patch repository + associated with that kernel tree, pop and push all of my + patches, and build and test that kernel. + + Managing patches in a repository makes it possible for + multiple developers to work on the same patch series without + colliding with each other, all on top of an underlying source + base that they may or may not control. + + + MQ support for patch repositories + + MQ helps you to work with the .hg/patches directory as a + repository; when you prepare a repository for working with + patches using qinit, you + can pass the option to create the .hg/patches directory as a + Mercurial repository. 
+ + + If you forget to use the option, you + can simply go into the .hg/patches directory at any + time and run hg init. + Don't forget to add an entry for the status file to the .hgignore file, though + + (hg qinit + does this for you automatically); you + really don't want to manage the + status file. + + + As a convenience, if MQ notices that the .hg/patches directory is a + repository, it will automatically hg + add every patch that you create and import. + + MQ provides a shortcut command, qcommit, that runs hg commit in the .hg/patches + directory. This saves some bothersome typing. + + Finally, as a convenience to manage the patch directory, + you can define the alias mq on Unix + systems. For example, on Linux systems using the + bash shell, you can include the following + snippet in your ~/.bashrc. + + alias mq='hg -R $(hg root)/.hg/patches' + + You can then issue commands of the form mq + pull from the main repository. + + + + A few things to watch out for + + MQ's support for working with a repository full of patches + is limited in a few small respects. + + MQ cannot automatically detect changes that you make to + the patch directory. If you hg + pull, manually edit, or hg + update changes to patches or the series file, you will have to + hg qpop and + then hg qpush in + the underlying repository to see those changes show up there. + If you forget to do this, you can confuse MQ's idea of which + patches are applied. + + + + + Third party tools for working with patches + + Once you've been working with patches for a while, you'll + find yourself hungry for tools that will help you to understand + and manipulate the patches you're dealing with. + + The diffstat command + web:diffstat generates a histogram of the + modifications made to each file in a patch. It provides a good + way to get a sense of a patch&emdash;which files + it affects, and how much change it introduces to each file and + as a whole.
(I find that it's a good idea to use + diffstat's option as a matter of + course, as otherwise it will try to do clever things with + prefixes of file names that inevitably confuse at least + me.) + +&interaction.mq.tools.tools; + + The patchutils package + web:patchutils is invaluable. It provides a + set of small utilities that follow the Unix + philosophy; each does one useful thing with a patch. + The patchutils command I use + most is filterdiff, which extracts subsets + from a patch file. For example, given a patch that modifies + hundreds of files across dozens of directories, a single + invocation of filterdiff can generate a + smaller patch that only touches files whose names match a + particular glob pattern. See for another + example. + + + + Good ways to work with patches + + Whether you are working on a patch series to submit to a + free software or open source project, or a series that you + intend to treat as a sequence of regular changesets when you're + done, you can use some simple techniques to keep your work well + organized. + + Give your patches descriptive names. A good name for a + patch might be rework-device-alloc.patch, + because it will immediately give you a hint what the purpose of + the patch is. Long names shouldn't be a problem; you won't be + typing the names often, but you will be + running commands like qapplied and qtop over and over. Good naming + becomes especially important when you have a number of patches + to work with, or if you are juggling a number of different tasks + and your patches only get a fraction of your attention. + + Be aware of what patch you're working on. Use the qtop command and skim over the text + of your patches frequently&emdash;for example, using hg tip&emdash;to be sure + of where you stand. I have several times worked on and qrefreshed a patch other than the + one I intended, and it's often tricky to migrate changes into + the right patch after making them in the wrong one.
+ + For this reason, it is very much worth investing a little + time to learn how to use some of the third-party tools I + described in , + particularly + diffstat and filterdiff. + The former will give you a quick idea of what changes your patch + is making, while the latter makes it easy to splice hunks + selectively out of one patch and into another. + + + + MQ cookbook + + + Manage <quote>trivial</quote> patches + + Because the overhead of dropping files into a new + Mercurial repository is so low, it makes a lot of sense to + manage patches this way even if you simply want to make a few + changes to a source tarball that you downloaded. + + Begin by downloading and unpacking the source tarball, and + turning it into a Mercurial repository. + + &interaction.mq.tarball.download; + + Continue by creating a patch stack and making your + changes. + + &interaction.mq.tarball.qinit; + + Let's say a few weeks or months pass, and your package + author releases a new version. First, bring their changes + into the repository. + + &interaction.mq.tarball.newsource; + + The pipeline starting with hg + locate above deletes all files in the working + directory, so that hg + commit's option can + actually tell which files have really been removed in the + newer version of the source. + + Finally, you can apply your patches on top of the new + tree. + + &interaction.mq.tarball.repush; + + + + Combining entire patches + + MQ provides a command, qfold that lets you combine + entire patches. This folds the patches you + name, in the order you name them, into the topmost applied + patch, and concatenates their descriptions onto the end of its + description. The patches that you fold must be unapplied + before you fold them. + + The order in which you fold patches matters. If your + topmost applied patch is foo, and you + qfold + bar and quux into it, + you will end up with a patch that has the same effect as if + you applied first foo, then + bar, followed by + quux. 
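Why the fold order matters can be seen in a small Python model of the behaviour just described. It is a sketch, not MQ's code: the `fold` function, the dictionary representation of a patch, and the `separator` used between descriptions are all assumptions made for illustration.

```python
def fold(top, others, separator="\n"):
    """Fold `others`, in the order given, into the patch `top`.

    Each patch is modelled as a dict with a "desc" string and a list of
    "changes". The representation and `separator` are hypothetical; they
    only illustrate the ordering rule from the text.
    """
    # Start from a copy of the topmost applied patch, leaving it untouched.
    folded = {"desc": top["desc"], "changes": list(top["changes"])}
    for patch in others:
        # Descriptions are concatenated onto the end, in fold order.
        folded["desc"] += separator + patch["desc"]
        # Changes take effect as if the patches were applied one by one.
        folded["changes"].extend(patch["changes"])
    return folded


foo = {"desc": "foo", "changes": ["edit a.c"]}
bar = {"desc": "bar", "changes": ["edit b.c"]}
quux = {"desc": "quux", "changes": ["edit a.c again"]}

result = fold(foo, [bar, quux])
print(result["desc"])
print(result["changes"])
```

In this model, swapping `bar` and `quux` in the call changes the order of both the combined description and the combined changes, which mirrors the effect of applying the patches in a different sequence.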
+ + + + Merging part of one patch into another + + Merging part of one patch into + another is more difficult than combining entire + patches. + + If you want to move changes to entire files, you can use + filterdiff's and options to choose the + modifications to snip out of one patch, concatenating its + output onto the end of the patch you want to merge into. You + usually won't need to modify the patch you've merged the + changes from. Instead, MQ will report some rejected hunks + when you qpush it (from + the hunks you moved into the other patch), and you can simply + qrefresh the patch to drop + the duplicate hunks. + + If you have a patch that has multiple hunks modifying a + file, and you only want to move a few of those hunks, the job + becomes more messy, but you can still partly automate it. Use + lsdiff -nvv to print some metadata about + the patch. + + &interaction.mq.tools.lsdiff; + + This command prints three different kinds of + number: + + (in the first column) a file + number to identify each file modified in the + patch; + + (on the next line, indented) the line number + within a modified file where a hunk starts; and + + (on the same line) a hunk + number to identify that hunk. + + + You'll have to use some visual inspection, and reading of + the patch, to identify the file and hunk numbers you'll want, + but you can then pass them to + filterdiff's and options, to + select exactly the file and hunk you want to extract. + + Once you have this hunk, you can concatenate it onto the + end of your destination patch and continue with the remainder + of . + + + + + Differences between quilt and MQ + + If you are already familiar with quilt, MQ provides a + similar command set. There are a few differences in the way + that it works. + + You will already have noticed that most quilt commands have + MQ counterparts that simply begin with a + q.
The exceptions are quilt's + add and remove commands, + the counterparts for which are the normal Mercurial hg add and hg + remove commands. There is no MQ equivalent of the + quilt edit command. + + +
+ + diff -r 5276f40fca1c -r 896ab6eaf1c6 en/ch13-hgext.xml --- a/en/ch13-hgext.xml Thu Jul 09 13:32:44 2009 +0900 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,554 +0,0 @@ - - - - - Adding functionality with extensions - - While the core of Mercurial is quite complete from a - functionality standpoint, it's deliberately shorn of fancy - features. This approach of preserving simplicity keeps the - software easy to deal with for both maintainers and users. - - However, Mercurial doesn't box you in with an inflexible - command set: you can add features to it as - extensions (sometimes known as - plugins). We've already discussed a few of - these extensions in earlier chapters. - - - covers the fetch extension; - this combines pulling new changes and merging them with local - changes into a single command, fetch. - - In , we covered - several extensions that are useful for hook-related - functionality: acl adds - access control lists; bugzilla adds integration with the - Bugzilla bug tracking system; and notify sends notification emails on - new changes. - - The Mercurial Queues patch management extension is - so invaluable that it merits two chapters and an appendix all - to itself. covers the - basics; discusses advanced topics; - and goes into detail on - each - command. - - - In this chapter, we'll cover some of the other extensions that - are available for Mercurial, and briefly touch on some of the - machinery you'll need to know about if you want to write an - extension of your own. - - In , - we'll discuss the possibility of huge - performance improvements using the inotify extension. - - - - Improve performance with the <literal - role="hg-ext">inotify</literal> extension - - Are you interested in having some of the most common - Mercurial operations run as much as a hundred times faster? - Read on! - - Mercurial has great performance under normal circumstances. 
- For example, when you run the hg - status command, Mercurial has to scan almost every - directory and file in your repository so that it can display - file status. Many other Mercurial commands need to do the same - work behind the scenes; for example, the hg diff command uses the status - machinery to avoid doing an expensive comparison operation on - files that obviously haven't changed. - - Because obtaining file status is crucial to good - performance, the authors of Mercurial have optimised this code - to within an inch of its life. However, there's no avoiding the - fact that when you run hg - status, Mercurial is going to have to perform at - least one expensive system call for each managed file to - determine whether it's changed since the last time Mercurial - checked. For a sufficiently large repository, this can take a - long time. - - To put a number on the magnitude of this effect, I created a - repository containing 150,000 managed files. I timed hg status as taking ten seconds to - run, even when none of those files had been - modified. - - Many modern operating systems contain a file notification - facility. If a program signs up to an appropriate service, the - operating system will notify it every time a file of interest is - created, modified, or deleted. On Linux systems, the kernel - component that does this is called - inotify. - - Mercurial's inotify - extension talks to the kernel's inotify - component to optimise hg status - commands. The extension has two components. A daemon sits in - the background and receives notifications from the - inotify subsystem. It also listens for - connections from a regular Mercurial command. The extension - modifies Mercurial's behavior so that instead of scanning the - filesystem, it queries the daemon. Since the daemon has perfect - information about the state of the repository, it can respond - with a result instantaneously, avoiding the need to scan every - directory and file in the repository. 
- - Recall the ten seconds that I measured plain Mercurial as - taking to run hg status on a - 150,000 file repository. With the inotify extension enabled, the time - dropped to 0.1 seconds, a factor of one - hundred faster. - - Before we continue, please pay attention to some - caveats. - - The inotify - extension is Linux-specific. Because it interfaces directly - to the Linux kernel's inotify subsystem, - it does not work on other operating systems. - - It should work on any Linux distribution that - was released after early 2005. Older distributions are - likely to have a kernel that lacks - inotify, or a version of - glibc that does not have the necessary - interfacing support. - - Not all filesystems are suitable for use with - the inotify extension. - Network filesystems such as NFS are a non-starter, for - example, particularly if you're running Mercurial on several - systems, all mounting the same network filesystem. The - kernel's inotify system has no way of - knowing about changes made on another system. Most local - filesystems (e.g. ext3, XFS, ReiserFS) should work - fine. - - - The inotify extension is - not yet shipped with Mercurial as of May 2007, so it's a little - more involved to set up than other extensions. But the - performance improvement is worth it! - - The extension currently comes in two parts: a set of patches - to the Mercurial source code, and a library of Python bindings - to the inotify subsystem. - - There are two Python - inotify binding libraries. One of them is - called pyinotify, and is packaged by some - Linux distributions as python-inotify. - This is not the one you'll need, as it is - too buggy and inefficient to be practical. - - To get going, it's best to already have a functioning copy - of Mercurial installed. - - If you follow the instructions below, you'll be - replacing and overwriting any existing - installation of Mercurial that you might already have, using - the latest bleeding edge Mercurial code. 
Don't - say you weren't warned! - - - Clone the Python inotify - binding repository. Build and install it. - hg clone http://hg.kublai.com/python/inotify -cd inotify -python setup.py build --force -sudo python setup.py install --skip-build - - Clone the crew Mercurial repository. - Clone the inotify patch - repository so that Mercurial Queues will be able to apply - patches to your cope of the crew repository. - hg clone http://hg.intevation.org/mercurial/crew -hg clone crew inotify -hg clone http://hg.kublai.com/mercurial/patches/inotify inotify/.hg/patches - - Make sure that you have the Mercurial Queues - extension, mq, enabled. If - you've never used MQ, read to get started - quickly. - - Go into the inotify repo, and apply all - of the inotify patches - using the option to the qpush command. - cd inotify -hg qpush -a - - If you get an error message from qpush, you should not continue. - Instead, ask for help. - - Build and install the patched version of - Mercurial. - python setup.py build --force -sudo python setup.py install --skip-build - - - Once you've build a suitably patched version of Mercurial, - all you need to do to enable the inotify extension is add an entry to - your ~/.hgrc. - [extensions] inotify = - When the inotify extension - is enabled, Mercurial will automatically and transparently start - the status daemon the first time you run a command that needs - status in a repository. It runs one status daemon per - repository. - - The status daemon is started silently, and runs in the - background. If you look at a list of running processes after - you've enabled the inotify - extension and run a few commands in different repositories, - you'll thus see a few hg processes sitting - around, waiting for updates from the kernel and queries from - Mercurial. - - The first time you run a Mercurial command in a repository - when you have the inotify - extension enabled, it will run with about the same performance - as a normal Mercurial command. 
This is because the status - daemon needs to perform a normal status scan so that it has a - baseline against which to apply later updates from the kernel. - However, every subsequent command that does - any kind of status check should be noticeably faster on - repositories of even fairly modest size. Better yet, the bigger - your repository is, the greater a performance advantage you'll - see. The inotify daemon makes - status operations almost instantaneous on repositories of all - sizes! - - If you like, you can manually start a status daemon using - the inserve command. - This gives you slightly finer control over how the daemon ought - to run. This command will of course only be available when the - inotify extension is - enabled. - - When you're using the inotify extension, you should notice - no difference at all in Mercurial's - behavior, with the sole exception of status-related commands - running a whole lot faster than they used to. You should - specifically expect that commands will not print different - output; neither should they give different results. If either of - these situations occurs, please report a bug. - - - - Flexible diff support with the <literal - role="hg-ext">extdiff</literal> extension - - Mercurial's built-in hg - diff command outputs plaintext unified diffs. - - &interaction.extdiff.diff; - - If you would like to use an external tool to display - modifications, you'll want to use the extdiff extension. This will let you - use, for example, a graphical diff tool. - - The extdiff extension is - bundled with Mercurial, so it's easy to set up. In the extensions section of your - ~/.hgrc, simply add a - one-line entry to enable the extension. - [extensions] -extdiff = - This introduces a command named extdiff, which by default uses - your system's diff command to generate a - unified diff in the same form as the built-in hg diff command. 
- - &interaction.extdiff.extdiff; - - The result won't be exactly the same as with the built-in - hg diff variations, because the - output of diff varies from one system to - another, even when passed the same options. - - As the making snapshot - lines of output above imply, the extdiff command works by - creating two snapshots of your source tree. The first snapshot - is of the source revision; the second, of the target revision or - working directory. The extdiff command generates - these snapshots in a temporary directory, passes the name of - each directory to an external diff viewer, then deletes the - temporary directory. For efficiency, it only snapshots the - directories and files that have changed between the two - revisions. - - Snapshot directory names have the same base name as your - repository. If your repository path is /quux/bar/foo, then foo will be the name of each - snapshot directory. Each snapshot directory name has its - changeset ID appended, if appropriate. If a snapshot is of - revision a631aca1083f, the directory will be - named foo.a631aca1083f. - A snapshot of the working directory won't have a changeset ID - appended, so it would just be foo in this example. To see what - this looks like in practice, look again at the extdiff example above. Notice - that the diff has the snapshot directory names embedded in its - header. - - The extdiff command - accepts two important options. The option - lets you choose a program to view differences with, instead of - diff. With the option, - you can change the options that extdiff passes to the program - (by default, these options are - -Npru, which only make sense - if you're running diff). In other respects, - the extdiff command - acts similarly to the built-in hg - diff command: you use the same option names, syntax, - and arguments to specify the revisions you want, the files you - want, and so on. 
- - As an example, here's how to run the normal system - diff command, getting it to generate context - diffs (using the option) - instead of unified diffs, and five lines of context instead of - the default three (passing 5 as the argument - to the option). - - &interaction.extdiff.extdiff-ctx; - - Launching a visual diff tool is just as easy. Here's how to - launch the kdiff3 viewer. - hg extdiff -p kdiff3 -o - - If your diff viewing command can't deal with directories, - you can easily work around this with a little scripting. For an - example of such scripting in action with the mq extension and the - interdiff command, see . - - - Defining command aliases - - It can be cumbersome to remember the options to both the - extdiff command and - the diff viewer you want to use, so the extdiff extension lets you define - new commands that will invoke your diff - viewer with exactly the right options. - - All you need to do is edit your ~/.hgrc, and add a section named - extdiff. Inside this - section, you can define multiple commands. Here's how to add - a kdiff3 command. Once you've defined - this, you can type hg kdiff3 - and the extdiff extension - will run kdiff3 for you. - [extdiff] -cmd.kdiff3 = - If you leave the right hand side of the definition empty, - as above, the extdiff - extension uses the name of the command you defined as the name - of the external program to run. But these names don't have to - be the same. Here, we define a command named - hg wibble, which runs - kdiff3. - [extdiff] - cmd.wibble = kdiff3 - - You can also specify the default options that you want to - invoke your diff viewing program with. The prefix to use is - opts., followed by the name - of the command to which the options apply. This example - defines a hg vimdiff command - that runs the vim editor's - DirDiff extension. 
- [extdiff] - cmd.vimdiff = vim -opts.vimdiff = -f '+next' '+execute "DirDiff" argv(0) argv(1)' - - - - - Cherrypicking changes with the <literal - role="hg-ext">transplant</literal> extension - - Need to have a long chat with Brendan about this. - - - - Send changes via email with the <literal - role="hg-ext">patchbomb</literal> extension - - Many projects have a culture of change - review, in which people send their modifications to a - mailing list for others to read and comment on before they - commit the final version to a shared repository. Some projects - have people who act as gatekeepers; they apply changes from - other people to a repository to which those others don't have - access. - - Mercurial makes it easy to send changes over email for - review or application, via its patchbomb extension. The extension is - so named because changes are formatted as patches, and it's usual - to send one changeset per email message. Sending a long series - of changes by email is thus much like bombing the - recipient's inbox, hence patchbomb. - - As usual, the basic configuration of the patchbomb extension takes just one or - two lines in your - /.hgrc. - [extensions] -patchbomb = - Once you've enabled the extension, you will have a new - command available, named email. - - The safest and best way to invoke the email command is to - always run it first with the option. - This will show you what the command would - send, without actually sending anything. Once you've had a - quick glance over the changes and verified that you are sending - the right ones, you can rerun the same command, with the option - removed. - - The email command - accepts the same kind of revision syntax as every other - Mercurial command. For example, this command will send every - revision between 7 and tip, inclusive. - hg email -n 7:tip - You can also specify a repository to - compare with. 
If you provide a repository but no revisions, the - email command will - send all revisions in the local repository that are not present - in the remote repository. If you additionally specify revisions - or a branch name (the latter using the option), - this will constrain the revisions sent. - - It's perfectly safe to run the email command without the - names of the people you want to send to: if you do this, it will - just prompt you for those values interactively. (If you're - using a Linux or Unix-like system, you should have enhanced - readline-style editing capabilities when - entering those headers, too, which is useful.) - - When you are sending just one revision, the email command will by - default use the first line of the changeset description as the - subject of the single email message it sends. - - If you send multiple revisions, the email command will usually - send one message per changeset. It will preface the series with - an introductory message, in which you should describe the - purpose of the series of changes you're sending. - - - Changing the behavior of patchbombs - - Not every project has exactly the same conventions for - sending changes in email; the patchbomb extension tries to - accommodate a number of variations through command line - options. - - You can write a subject for the introductory - message on the command line using the - option. This takes one argument, the text of the subject - to use. - - To change the email address from which the - messages originate, use the - option. This takes one argument, the email address to - use. - - The default behavior is to send unified diffs - (see for a - description of the - format), one per message. You can send a binary bundle - instead with the - option. - - Unified diffs are normally prefaced with a - metadata header. You can omit this, and send unadorned - diffs, with the option. - - Diffs are normally sent inline, - in the same body part as the description of a patch. 
This - makes it easiest for the largest number of readers to - quote and respond to parts of a diff, as some mail clients - will only quote the first MIME body part in a message. If - you'd prefer to send the description and the diff in - separate body parts, use the - option. - - Instead of sending mail messages, you can - write them to an mbox-format mail - folder using the - option. That option takes one argument, the name of the - file to write to. - - If you would like to add a - diffstat-format summary to each patch, - and one to the introductory message, use the - option. The diffstat command displays - a table containing the name of each file patched, the - number of lines affected, and a histogram showing how much - each file is modified. This gives readers a qualitative - glance at how complex a patch is. - - - - - - - diff -r 5276f40fca1c -r 896ab6eaf1c6 en/ch13-mq-collab.xml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/en/ch13-mq-collab.xml Fri Jul 10 02:32:17 2009 +0900 @@ -0,0 +1,525 @@ + + + + + Advanced uses of Mercurial Queues + + While it's easy to pick up straightforward uses of Mercurial + Queues, use of a little discipline and some of MQ's less + frequently used capabilities makes it possible to work in + complicated development environments. + + In this chapter, I will use as an example a technique I have + used to manage the development of an Infiniband device driver for + the Linux kernel. The driver in question is large (at least as + drivers go), with 25,000 lines of code spread across 35 source + files. It is maintained by a small team of developers. + + While much of the material in this chapter is specific to + Linux, the same principles apply to any code base for which you're + not the primary owner, and upon which you need to do a lot of + development. + + + The problem of many targets + + The Linux kernel changes rapidly, and has never been + internally stable; developers frequently make drastic changes + between releases. 
This means that a version of the driver that + works well with a particular released version of the kernel will + not even compile correctly against, + typically, any other version. + + To maintain a driver, we have to keep a number of distinct + versions of Linux in mind. + + One target is the main Linux kernel development + tree. Maintenance of the code is in this case partly shared + by other developers in the kernel community, who make + drive-by modifications to the driver as they + develop and refine kernel subsystems. + + We also maintain a number of + backports to older versions of the Linux + kernel, to support the needs of customers who are running + older Linux distributions that do not incorporate our + drivers. (To backport a piece of code + is to modify it to work in an older version of its target + environment than the version it was developed for.) + + Finally, we make software releases on a schedule + that is necessarily not aligned with those used by Linux + distributors and kernel developers, so that we can deliver + new features to customers without forcing them to upgrade + their entire kernels or distributions. + + + + Tempting approaches that don't work well + + There are two standard ways to maintain a + piece of software that has to target many different + environments. + + The first is to maintain a number of branches, each + intended for a single target. The trouble with this approach + is that you must maintain iron discipline in the flow of + changes between repositories. A new feature or bug fix must + start life in a pristine repository, then + percolate out to every backport repository. Backport changes + are more limited in the branches they should propagate to; a + backport change that is applied to a branch where it doesn't + belong will probably stop the driver from compiling. 
+ + The second is to maintain a single source tree filled with + conditional statements that turn chunks of code on or off + depending on the intended target. Because these + ifdefs are not allowed in the Linux kernel + tree, a manual or automatic process must be followed to strip + them out and yield a clean tree. A code base maintained in + this fashion rapidly becomes a rat's nest of conditional + blocks that are difficult to understand and maintain. + + Neither of these approaches is well suited to a situation + where you don't own the canonical copy of a + source tree. In the case of a Linux driver that is + distributed with the standard kernel, Linus's tree contains + the copy of the code that will be treated by the world as + canonical. The upstream version of my driver + can be modified by people I don't know, without me even + finding out about it until after the changes show up in + Linus's tree. + + These approaches have the added weakness of making it + difficult to generate well-formed patches to submit + upstream. + + In principle, Mercurial Queues seems like a good candidate + to manage a development scenario such as the above. While + this is indeed the case, MQ contains a few added features that + make the job more pleasant. + + + + + Conditionally applying patches with guards + + Perhaps the best way to maintain sanity with so many targets + is to be able to choose specific patches to apply for a given + situation. MQ provides a feature called guards + (which originates with quilt's guards + command) that does just this. To start off, let's create a + simple repository for experimenting in. + + &interaction.mq.guards.init; + + This gives us a tiny repository that contains two patches + that don't have any dependencies on each other, because they + touch different files. 
+ + The idea behind conditional application is that you can + tag a patch with a guard, + which is simply a text string of your choosing, then tell MQ to + select specific guards to use when applying patches. MQ will + then either apply, or skip over, a guarded patch, depending on + the guards that you have selected. + + A patch can have an arbitrary number of guards; each one is + positive (apply this patch if this + guard is selected) or negative + (skip this patch if this guard is selected). A + patch with no guards is always applied. + + + + Controlling the guards on a patch + + The qguard command lets + you determine which guards should apply to a patch, or display + the guards that are already in effect. Without any arguments, it + displays the guards on the current topmost patch. + + &interaction.mq.guards.qguard; + + To set a positive guard on a patch, prefix the name of the + guard with a +. + + &interaction.mq.guards.qguard.pos; + + To set a negative guard + on a patch, prefix the name of the guard with a + -. + + &interaction.mq.guards.qguard.neg; + + Notice that we prefixed the arguments to the hg + qguard command with a -- here, so + that Mercurial would not interpret the text + -quux as an option. + + + Setting vs. modifying + + The qguard command + sets the guards on a patch; it doesn't + modify them. What this means is that if + you run hg qguard +a +b on a + patch, then hg qguard +c on + the same patch, the only guard that will + be set on it afterwards is +c. + + + Mercurial stores guards in the series file; the form in which they + are stored is easy both to understand and to edit by hand. (In + other words, you don't have to use the qguard command if you don't want + to; it's okay to simply edit the series file.) + + &interaction.mq.guards.series; + + + + Selecting the guards to use + + The qselect command + determines which guards are active at a given time. 
The effect + of this is to determine which patches MQ will apply the next + time you run qpush. It has + no other effect; in particular, it doesn't do anything to + patches that are already applied. + + With no arguments, the qselect command lists the guards + currently in effect, one per line of output. Each argument is + treated as the name of a guard to apply. + + &interaction.mq.guards.qselect.foo; + + In case you're interested, the currently selected guards are + stored in the guards file. + + &interaction.mq.guards.qselect.cat; + + We can see the effect the selected guards have when we run + qpush. + + &interaction.mq.guards.qselect.qpush; + + A guard cannot start with a + + or + - character. The name of a + guard must not contain white space, but most other characters + are acceptable. If you try to use a guard with an invalid name, + MQ will complain: + + &interaction.mq.guards.qselect.error; + + Changing the selected guards changes the patches that are + applied. + + &interaction.mq.guards.qselect.quux; + + You can see in the example below that negative guards take + precedence over positive guards. + + &interaction.mq.guards.qselect.foobar; + + + + MQ's rules for applying patches + + The rules that MQ uses when deciding whether to apply a + patch are as follows. + + A patch that has no guards is always + applied. + + If the patch has any negative guard that matches + any currently selected guard, the patch is skipped. + + If the patch has any positive guard that matches + any currently selected guard, the patch is applied. + + If the patch has positive or negative guards, + but none matches any currently selected guard, the patch is + skipped. + + + + + Trimming the work environment + + In working on the device driver I mentioned earlier, I don't + apply the patches to a normal Linux kernel tree. Instead, I use + a repository that contains only a snapshot of the source files + and headers that are relevant to Infiniband development. 
This + repository is 1% the size of a kernel repository, so it's easier + to work with. + + I then choose a base version on top of which + the patches are applied. This is a snapshot of the Linux kernel + tree as of a revision of my choosing. When I take the snapshot, + I record the changeset ID from the kernel repository in the + commit message. Since the snapshot preserves the + shape and content of the relevant parts of the + kernel tree, I can apply my patches on top of either my tiny + repository or a normal kernel tree. + + Normally, the base tree atop which the patches apply should + be a snapshot of a very recent upstream tree. This best + facilitates the development of patches that can easily be + submitted upstream with few or no modifications. + + + + Dividing up the <filename role="special">series</filename> + file + + I categorise the patches in the series file into a number of logical + groups. Each section of like patches begins with a block of + comments that describes the purpose of the patches that + follow. + + The sequence of patch groups that I maintain follows. The + ordering of these groups is important; I'll describe why after I + introduce the groups. + + The accepted group. Patches that + the development team has submitted to the maintainer of the + Infiniband subsystem, and which he has accepted, but which + are not present in the snapshot that the tiny repository is + based on. These are read only patches, + present only to transform the tree into a similar state as + it is in the upstream maintainer's repository. + + The rework group. Patches that I + have submitted, but that the upstream maintainer has + requested modifications to before he will accept + them. + + The pending group. Patches that + I have not yet submitted to the upstream maintainer, but + which we have finished working on. These will be read + only for a while. If the upstream maintainer + accepts them upon submission, I'll move them to the end of + the accepted group. 
If he requests that I + modify any, I'll move them to the beginning of the + rework group. + + The in progress group. Patches + that are actively being developed, and should not be + submitted anywhere yet. + + The backport group. Patches that + adapt the source tree to older versions of the kernel + tree. + + The do not ship group. Patches + that for some reason should never be submitted upstream. + For example, one such patch might change embedded driver + identification strings to make it easier to distinguish, in + the field, between an out-of-tree version of the driver and + a version shipped by a distribution vendor. + + + Now to return to the reasons for ordering groups of patches + in this way. We would like the lowest patches in the stack to + be as stable as possible, so that we will not need to rework + higher patches due to changes in context. Putting patches that + will never be changed first in the series file serves this + purpose. + + We would also like the patches that we know we'll need to + modify to be applied on top of a source tree that resembles the + upstream tree as closely as possible. This is why we keep + accepted patches around for a while. + + The backport and do not ship + patches float at the end of the series file. The backport patches + must be applied on top of all other patches, and the do + not ship patches might as well stay out of harm's + way. + + + + Maintaining the patch series + + In my work, I use a number of guards to control which + patches are to be applied. + + + Accepted patches are guarded with + accepted. I enable this guard most of + the time. When I'm applying the patches on top of a tree + where the patches are already present, I can turn this guard + off, and the patches that follow the accepted group will apply + cleanly. + + Patches that are finished, but + not yet submitted, have no guards.
If I'm applying the + patch stack to a copy of the upstream tree, I don't need to + enable any guards in order to get a reasonably safe source + tree. + + Those patches that need reworking before being + resubmitted are guarded with + rework. + + For those patches that are still under + development, I use devel. + + A backport patch may have several guards, one + for each version of the kernel to which it applies. For + example, a patch that backports a piece of code to 2.6.9 + will have a 2.6.9 guard. + + This variety of guards gives me considerable flexibility in + determining what kind of source tree I want to end up with. For + most situations, the selection of appropriate guards is + automated during the build process, but I can manually tune the + guards to use for less common circumstances. + + + The art of writing backport patches + + Using MQ, writing a backport patch is a simple process. + All such a patch has to do is modify a piece of code that uses + a kernel feature not present in the older version of the + kernel, so that the driver continues to work correctly under + that older version. + + A useful goal when writing a good backport patch is to + make your code look as if it was written for the older version + of the kernel you're targeting. The less obtrusive the patch, + the easier it will be to understand and maintain. If you're + writing a collection of backport patches to avoid the + rat's nest effect of lots of + #ifdefs (hunks of source code that are only + used conditionally) in your code, don't introduce + version-dependent #ifdefs into the patches. + Instead, write several patches, each of which makes + unconditional changes, and control their application using + guards. + + There are two reasons to divide backport patches into a + distinct group, away from the regular patches + whose effects they modify. 
The first is that intermingling the + two makes it more difficult to use a tool like the patchbomb extension to automate the + process of submitting the patches to an upstream maintainer. + The second is that a backport patch could perturb the context + in which a subsequent regular patch is applied, making it + impossible to apply the regular patch cleanly + without the earlier backport patch + already being applied. + + + + + Useful tips for developing with MQ + + + Organising patches in directories + + If you're working on a substantial project with MQ, it's + not difficult to accumulate a large number of patches. For + example, I have one patch repository that contains over 250 + patches. + + If you can group these patches into separate logical + categories, you can if you like store them in different + directories; MQ has no problems with patch names that contain + path separators. + + + + Viewing the history of a patch + + If you're developing a set of patches over a long time, + it's a good idea to maintain them in a repository, as + discussed in . If you do + so, you'll quickly + discover that using the hg + diff command to look at the history of changes to + a patch is unworkable. This is in part because you're looking + at the second derivative of the real code (a diff of a diff), + but also because MQ adds noise to the process by modifying + time stamps and directory names when it updates a + patch. + + However, you can use the extdiff extension, which is bundled + with Mercurial, to turn a diff of two versions of a patch into + something readable. To do this, you will need a third-party + package called patchutils + web:patchutils. This provides a command + named interdiff, which shows the + differences between two diffs as a diff. Used on two versions + of the same diff, it generates a diff that represents the diff + from the first to the second version. 
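The interdiff command compares one pair of patch files at a time, whereas the extdiff extension hands its external program two snapshot directories. The following is a minimal sketch of a wrapper that bridges the two; it is not the hg-interdiff script from the book's examples directory, and the function names (relative_files, paired_files, run_interdiff) are hypothetical. It assumes that interdiff from patchutils is on the search path, and it simplifies matters by comparing only files that exist in both directories.

```python
#!/usr/bin/env python3
"""Sketch: run interdiff on matching files in two snapshot directories."""
import os
import subprocess
import sys


def relative_files(root):
    """Return the set of file paths under root, relative to root."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            found.add(os.path.relpath(full, root))
    return found


def paired_files(old_dir, new_dir):
    """Return sorted relative paths that exist in both directories."""
    return sorted(relative_files(old_dir) & relative_files(new_dir))


def run_interdiff(old_dir, new_dir):
    """Run interdiff on each pair; return how many runs exited non-zero."""
    failures = 0
    for rel in paired_files(old_dir, new_dir):
        # interdiff takes the two versions of the patch as its arguments.
        status = subprocess.call(
            ["interdiff", os.path.join(old_dir, rel), os.path.join(new_dir, rel)]
        )
        if status != 0:
            failures += 1
    return failures


# Only act when called with exactly two directory arguments.
if __name__ == "__main__" and len(sys.argv) == 3:
    sys.exit(1 if run_interdiff(sys.argv[1], sys.argv[2]) else 0)
```

Saved as an executable script on the search path, a wrapper of this kind could be named as the program for extdiff to run. A patch file that exists in only one of the two snapshots is silently skipped here, which a fuller version would probably want to handle.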
+ + You can enable the extdiff extension in the usual way, + by adding a line to the extensions section of your + ~/.hgrc. + [extensions] +extdiff = + The interdiff command expects to be + passed the names of two files, but the extdiff extension passes the program + it runs a pair of directories, each of which can contain an + arbitrary number of files. We thus need a small program that + will run interdiff on each pair of files in + these two directories. This program is available as hg-interdiff in the examples directory of the + source code repository that accompanies this book. + + With the hg-interdiff + program in your shell's search path, you can run it as + follows, from inside an MQ patch directory: + hg extdiff -p hg-interdiff -r A:B my-change.patch + Since you'll probably want to use this long-winded command + a lot, you can get hgext to + make it available as a normal Mercurial command, again by + editing your ~/.hgrc. + [extdiff] +cmd.interdiff = hg-interdiff + This directs hgext to + make an interdiff command available, so you + can now shorten the previous invocation of extdiff to something a + little more wieldy. + hg interdiff -r A:B my-change.patch + + + The interdiff command works well + only if the underlying files against which versions of a + patch are generated remain the same. If you create a patch, + modify the underlying files, and then regenerate the patch, + interdiff may not produce useful + output. + + + The extdiff extension is + useful for more than merely improving the presentation of MQ + patches. To read more about it, go to . + + + + + + diff -r 5276f40fca1c -r 896ab6eaf1c6 en/ch14-hgext.xml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/en/ch14-hgext.xml Fri Jul 10 02:32:17 2009 +0900 @@ -0,0 +1,554 @@ + + + + + Adding functionality with extensions + + While the core of Mercurial is quite complete from a + functionality standpoint, it's deliberately shorn of fancy + features. 
This approach of preserving simplicity keeps the + software easy to deal with for both maintainers and users. + + However, Mercurial doesn't box you in with an inflexible + command set: you can add features to it as + extensions (sometimes known as + plugins). We've already discussed a few of + these extensions in earlier chapters. + + + covers the fetch extension; + this combines pulling new changes and merging them with local + changes into a single command, fetch. + + In , we covered + several extensions that are useful for hook-related + functionality: acl adds + access control lists; bugzilla adds integration with the + Bugzilla bug tracking system; and notify sends notification emails on + new changes. + + The Mercurial Queues patch management extension is + so invaluable that it merits two chapters and an appendix all + to itself. covers the + basics; discusses advanced topics; + and goes into detail on + each + command. + + + In this chapter, we'll cover some of the other extensions that + are available for Mercurial, and briefly touch on some of the + machinery you'll need to know about if you want to write an + extension of your own. + + In , + we'll discuss the possibility of huge + performance improvements using the inotify extension. + + + + Improve performance with the <literal + role="hg-ext">inotify</literal> extension + + Are you interested in having some of the most common + Mercurial operations run as much as a hundred times faster? + Read on! + + Mercurial has great performance under normal circumstances. + For example, when you run the hg + status command, Mercurial has to scan almost every + directory and file in your repository so that it can display + file status. Many other Mercurial commands need to do the same + work behind the scenes; for example, the hg diff command uses the status + machinery to avoid doing an expensive comparison operation on + files that obviously haven't changed. 
+ + Because obtaining file status is crucial to good + performance, the authors of Mercurial have optimised this code + to within an inch of its life. However, there's no avoiding the + fact that when you run hg + status, Mercurial is going to have to perform at + least one expensive system call for each managed file to + determine whether it's changed since the last time Mercurial + checked. For a sufficiently large repository, this can take a + long time. + + To put a number on the magnitude of this effect, I created a + repository containing 150,000 managed files. I timed hg status as taking ten seconds to + run, even when none of those files had been + modified. + + Many modern operating systems contain a file notification + facility. If a program signs up to an appropriate service, the + operating system will notify it every time a file of interest is + created, modified, or deleted. On Linux systems, the kernel + component that does this is called + inotify. + + Mercurial's inotify + extension talks to the kernel's inotify + component to optimise hg status + commands. The extension has two components. A daemon sits in + the background and receives notifications from the + inotify subsystem. It also listens for + connections from a regular Mercurial command. The extension + modifies Mercurial's behavior so that instead of scanning the + filesystem, it queries the daemon. Since the daemon has perfect + information about the state of the repository, it can respond + with a result instantaneously, avoiding the need to scan every + directory and file in the repository. + + Recall the ten seconds that I measured plain Mercurial as + taking to run hg status on a + 150,000 file repository. With the inotify extension enabled, the time + dropped to 0.1 seconds, a factor of one + hundred faster. + + Before we continue, please pay attention to some + caveats. + + The inotify + extension is Linux-specific. 
Because it interfaces directly + to the Linux kernel's inotify subsystem, + it does not work on other operating systems. + + It should work on any Linux distribution that + was released after early 2005. Older distributions are + likely to have a kernel that lacks + inotify, or a version of + glibc that does not have the necessary + interfacing support. + + Not all filesystems are suitable for use with + the inotify extension. + Network filesystems such as NFS are a non-starter, for + example, particularly if you're running Mercurial on several + systems, all mounting the same network filesystem. The + kernel's inotify system has no way of + knowing about changes made on another system. Most local + filesystems (e.g. ext3, XFS, ReiserFS) should work + fine. + + + The inotify extension is + not yet shipped with Mercurial as of May 2007, so it's a little + more involved to set up than other extensions. But the + performance improvement is worth it! + + The extension currently comes in two parts: a set of patches + to the Mercurial source code, and a library of Python bindings + to the inotify subsystem. + + There are two Python + inotify binding libraries. One of them is + called pyinotify, and is packaged by some + Linux distributions as python-inotify. + This is not the one you'll need, as it is + too buggy and inefficient to be practical. + + To get going, it's best to already have a functioning copy + of Mercurial installed. + + If you follow the instructions below, you'll be + replacing and overwriting any existing + installation of Mercurial that you might already have, using + the latest bleeding edge Mercurial code. Don't + say you weren't warned! + + + Clone the Python inotify + binding repository. Build and install it. + hg clone http://hg.kublai.com/python/inotify +cd inotify +python setup.py build --force +sudo python setup.py install --skip-build + + Clone the crew Mercurial repository. 
+ Clone the inotify patch + repository so that Mercurial Queues will be able to apply + patches to your copy of the crew repository. + hg clone http://hg.intevation.org/mercurial/crew +hg clone crew inotify +hg clone http://hg.kublai.com/mercurial/patches/inotify inotify/.hg/patches + + Make sure that you have the Mercurial Queues + extension, mq, enabled. If + you've never used MQ, read to get started + quickly. + + Go into the inotify repo, and apply all + of the inotify patches + using the option to the qpush command. + cd inotify +hg qpush -a + + If you get an error message from qpush, you should not continue. + Instead, ask for help. + + Build and install the patched version of + Mercurial. + python setup.py build --force +sudo python setup.py install --skip-build + + + Once you've built a suitably patched version of Mercurial, + all you need to do to enable the inotify extension is add an entry to + your ~/.hgrc. + [extensions] inotify = + When the inotify extension + is enabled, Mercurial will automatically and transparently start + the status daemon the first time you run a command that needs + status in a repository. It runs one status daemon per + repository. + + The status daemon is started silently, and runs in the + background. If you look at a list of running processes after + you've enabled the inotify + extension and run a few commands in different repositories, + you'll thus see a few hg processes sitting + around, waiting for updates from the kernel and queries from + Mercurial. + + The first time you run a Mercurial command in a repository + when you have the inotify + extension enabled, it will run with about the same performance + as a normal Mercurial command. This is because the status + daemon needs to perform a normal status scan so that it has a + baseline against which to apply later updates from the kernel.
+ However, every subsequent command that does + any kind of status check should be noticeably faster on + repositories of even fairly modest size. Better yet, the bigger + your repository is, the greater a performance advantage you'll + see. The inotify daemon makes + status operations almost instantaneous on repositories of all + sizes! + + If you like, you can manually start a status daemon using + the inserve command. + This gives you slightly finer control over how the daemon ought + to run. This command will of course only be available when the + inotify extension is + enabled. + + When you're using the inotify extension, you should notice + no difference at all in Mercurial's + behavior, with the sole exception of status-related commands + running a whole lot faster than they used to. You should + specifically expect that commands will not print different + output; neither should they give different results. If either of + these situations occurs, please report a bug. + + + + Flexible diff support with the <literal + role="hg-ext">extdiff</literal> extension + + Mercurial's built-in hg + diff command outputs plaintext unified diffs. + + &interaction.extdiff.diff; + + If you would like to use an external tool to display + modifications, you'll want to use the extdiff extension. This will let you + use, for example, a graphical diff tool. + + The extdiff extension is + bundled with Mercurial, so it's easy to set up. In the extensions section of your + ~/.hgrc, simply add a + one-line entry to enable the extension. + [extensions] +extdiff = + This introduces a command named extdiff, which by default uses + your system's diff command to generate a + unified diff in the same form as the built-in hg diff command. + + &interaction.extdiff.extdiff; + + The result won't be exactly the same as with the built-in + hg diff variations, because the + output of diff varies from one system to + another, even when passed the same options. 
+ + As the making snapshot + lines of output above imply, the extdiff command works by + creating two snapshots of your source tree. The first snapshot + is of the source revision; the second, of the target revision or + working directory. The extdiff command generates + these snapshots in a temporary directory, passes the name of + each directory to an external diff viewer, then deletes the + temporary directory. For efficiency, it only snapshots the + directories and files that have changed between the two + revisions. + + Snapshot directory names have the same base name as your + repository. If your repository path is /quux/bar/foo, then foo will be the name of each + snapshot directory. Each snapshot directory name has its + changeset ID appended, if appropriate. If a snapshot is of + revision a631aca1083f, the directory will be + named foo.a631aca1083f. + A snapshot of the working directory won't have a changeset ID + appended, so it would just be foo in this example. To see what + this looks like in practice, look again at the extdiff example above. Notice + that the diff has the snapshot directory names embedded in its + header. + + The extdiff command + accepts two important options. The option + lets you choose a program to view differences with, instead of + diff. With the option, + you can change the options that extdiff passes to the program + (by default, these options are + -Npru, which only make sense + if you're running diff). In other respects, + the extdiff command + acts similarly to the built-in hg + diff command: you use the same option names, syntax, + and arguments to specify the revisions you want, the files you + want, and so on. + + As an example, here's how to run the normal system + diff command, getting it to generate context + diffs (using the option) + instead of unified diffs, and five lines of context instead of + the default three (passing 5 as the argument + to the option). 
+ + &interaction.extdiff.extdiff-ctx; + + Launching a visual diff tool is just as easy. Here's how to + launch the kdiff3 viewer. + hg extdiff -p kdiff3 -o + + If your diff viewing command can't deal with directories, + you can easily work around this with a little scripting. For an + example of such scripting in action with the mq extension and the + interdiff command, see . + + + Defining command aliases + + It can be cumbersome to remember the options to both the + extdiff command and + the diff viewer you want to use, so the extdiff extension lets you define + new commands that will invoke your diff + viewer with exactly the right options. + + All you need to do is edit your ~/.hgrc, and add a section named + extdiff. Inside this + section, you can define multiple commands. Here's how to add + a kdiff3 command. Once you've defined + this, you can type hg kdiff3 + and the extdiff extension + will run kdiff3 for you. + [extdiff] +cmd.kdiff3 = + If you leave the right hand side of the definition empty, + as above, the extdiff + extension uses the name of the command you defined as the name + of the external program to run. But these names don't have to + be the same. Here, we define a command named + hg wibble, which runs + kdiff3. + [extdiff] + cmd.wibble = kdiff3 + + You can also specify the default options that you want to + invoke your diff viewing program with. The prefix to use is + opts., followed by the name + of the command to which the options apply. This example + defines a hg vimdiff command + that runs the vim editor's + DirDiff extension. + [extdiff] + cmd.vimdiff = vim +opts.vimdiff = -f '+next' '+execute "DirDiff" argv(0) argv(1)' + + + + + Cherrypicking changes with the <literal + role="hg-ext">transplant</literal> extension + + Need to have a long chat with Brendan about this. 
+ + + + Send changes via email with the <literal + role="hg-ext">patchbomb</literal> extension + + Many projects have a culture of change + review, in which people send their modifications to a + mailing list for others to read and comment on before they + commit the final version to a shared repository. Some projects + have people who act as gatekeepers; they apply changes from + other people to a repository to which those others don't have + access. + + Mercurial makes it easy to send changes over email for + review or application, via its patchbomb extension. The extension is + so named because changes are formatted as patches, and it's usual + to send one changeset per email message. Sending a long series + of changes by email is thus much like bombing the + recipient's inbox, hence patchbomb. + + As usual, the basic configuration of the patchbomb extension takes just one or + two lines in your + ~/.hgrc. + [extensions] +patchbomb = + Once you've enabled the extension, you will have a new + command available, named email. + + The safest and best way to invoke the email command is to + always run it first with the option. + This will show you what the command would + send, without actually sending anything. Once you've had a + quick glance over the changes and verified that you are sending + the right ones, you can rerun the same command, with the option + removed. + + The email command + accepts the same kind of revision syntax as every other + Mercurial command. For example, this command will send every + revision between 7 and tip, inclusive. + hg email -n 7:tip + You can also specify a repository to + compare with. If you provide a repository but no revisions, the + email command will + send all revisions in the local repository that are not present + in the remote repository. If you additionally specify revisions + or a branch name (the latter using the option), + this will constrain the revisions sent.
+ + It's perfectly safe to run the email command without the + names of the people you want to send to: if you do this, it will + just prompt you for those values interactively. (If you're + using a Linux or Unix-like system, you should have enhanced + readline-style editing capabilities when + entering those headers, too, which is useful.) + + When you are sending just one revision, the email command will by + default use the first line of the changeset description as the + subject of the single email message it sends. + + If you send multiple revisions, the email command will usually + send one message per changeset. It will preface the series with + an introductory message, in which you should describe the + purpose of the series of changes you're sending. + + + Changing the behavior of patchbombs + + Not every project has exactly the same conventions for + sending changes in email; the patchbomb extension tries to + accommodate a number of variations through command line + options. + + You can write a subject for the introductory + message on the command line using the + option. This takes one argument, the text of the subject + to use. + + To change the email address from which the + messages originate, use the + option. This takes one argument, the email address to + use. + + The default behavior is to send unified diffs + (see for a + description of the + format), one per message. You can send a binary bundle + instead with the + option. + + Unified diffs are normally prefaced with a + metadata header. You can omit this, and send unadorned + diffs, with the option. + + Diffs are normally sent inline, + in the same body part as the description of a patch. This + makes it easiest for the largest number of readers to + quote and respond to parts of a diff, as some mail clients + will only quote the first MIME body part in a message. If + you'd prefer to send the description and the diff in + separate body parts, use the + option. 
+ + Instead of sending mail messages, you can + write them to an mbox-format mail + folder using the + option. That option takes one argument, the name of the + file to write to. + + If you would like to add a + diffstat-format summary to each patch, + and one to the introductory message, use the + option. The diffstat command displays + a table containing the name of each file patched, the + number of lines affected, and a histogram showing how much + each file is modified. This gives readers a qualitative + glance at how complex a patch is. + + + + + + + diff -r 5276f40fca1c -r 896ab6eaf1c6 en/examples/auto-snippets.xml --- a/en/examples/auto-snippets.xml Thu Jul 09 13:32:44 2009 +0900 +++ b/en/examples/auto-snippets.xml Fri Jul 10 02:32:17 2009 +0900 @@ -1,4 +1,5 @@ + @@ -51,6 +52,25 @@ + + + + + + + + + + + + + + + + + + + @@ -60,6 +80,13 @@ + + + + + + + @@ -70,6 +97,19 @@ + + + + + + + + + + + + + @@ -114,8 +154,6 @@ - - @@ -208,6 +246,8 @@ + + diff -r 5276f40fca1c -r 896ab6eaf1c6 en/examples/backout --- a/en/examples/backout Thu Jul 09 13:32:44 2009 +0900 +++ b/en/examples/backout Fri Jul 10 02:32:17 2009 +0900 @@ -68,6 +68,10 @@ hg heads +#$ name: + +echo 'first change' > myfile + #$ name: manual.cat cat myfile diff -r 5276f40fca1c -r 896ab6eaf1c6 en/examples/bisect --- a/en/examples/bisect Thu Jul 09 13:32:44 2009 +0900 +++ b/en/examples/bisect Fri Jul 10 02:32:17 2009 +0900 @@ -37,15 +37,15 @@ #$ name: search.init -hg bisect init +hg bisect --reset #$ name: search.bad-init -hg bisect bad +hg bisect --bad #$ name: search.good-init -hg bisect good 10 +hg bisect --good 10 #$ name: search.step1 @@ -70,7 +70,7 @@ fi echo this revision is $result - hg bisect $result + hg bisect --$result } #$ name: search.step2 @@ -85,7 +85,7 @@ #$ name: search.reset -hg bisect reset +hg bisect --reset #$ name: diff -r 5276f40fca1c -r 896ab6eaf1c6 en/examples/ch01/new --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/en/examples/ch01/new Fri Jul 10 02:32:17 2009 +0900 @@ -0,0 
+1,39 @@ +#!/bin/bash + +cat > hello.c < goodbye.c < a +hg ci -Ama + +#$ name: rename.basic + +hg rename a b +hg diff + +#$ name: rename.git + +hg diff -g + +#$ name: + +hg revert -a +rm b + +#$ name: chmod + +chmod +x a +hg st +hg diff + +#$ name: chmod.git + +hg diff -g diff -r 5276f40fca1c -r 896ab6eaf1c6 en/examples/ch09/check_whitespace.py.lst --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/en/examples/ch09/check_whitespace.py.lst Fri Jul 10 02:32:17 2009 +0900 @@ -0,0 +1,47 @@ +#!/usr/bin/env python +# +# save as .hg/check_whitespace.py and make executable + +import re + +def trailing_whitespace(difflines): + # + linenum, header = 0, False + + for line in difflines: + if header: + # remember the name of the file that this diff affects + m = re.match(r'(?:---|\+\+\+) ([^\t]+)', line) + if m and m.group(1) != '/dev/null': + filename = m.group(1).split('/', 1)[-1] + if line.startswith('+++ '): + header = False + continue + if line.startswith('diff '): + header = True + continue + # hunk header - save the line number + m = re.match(r'@@ -\d+,\d+ \+(\d+),', line) + if m: + linenum = int(m.group(1)) + continue + # hunk body - check for an added line with trailing whitespace + m = re.match(r'\+.*\s$', line) + if m: + yield filename, linenum + if line and line[0] in ' +': + linenum += 1 + +if __name__ == '__main__': + import os, sys + + added = 0 + for filename, linenum in trailing_whitespace(os.popen('hg export tip')): + print >> sys.stderr, ('%s, line %d: trailing whitespace added' % + (filename, linenum)) + added += 1 + if added: + # save the commit message so we don't need to retype it + os.system('hg tip --template "{desc}" > .hg/commit.save') + print >> sys.stderr, 'commit message saved to .hg/commit.save' + sys.exit(1) diff -r 5276f40fca1c -r 896ab6eaf1c6 en/examples/ch09/hook.ws --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/en/examples/ch09/hook.ws Fri Jul 10 02:32:17 2009 +0900 @@ -0,0 +1,32 @@ +#!/bin/bash + +hg init a +cd a +echo '[hooks]' > 
.hg/hgrc +echo "pretxncommit.whitespace = hg export tip | (! egrep -q '^\\+.*[ \\t]$')" >> .hg/hgrc + +#$ name: simple + +cat .hg/hgrc +echo 'a ' > a +hg commit -A -m 'test with trailing whitespace' +echo 'a' > a +hg commit -A -m 'drop trailing whitespace and try again' + +#$ name: + +echo '[hooks]' > .hg/hgrc +echo "pretxncommit.whitespace = .hg/check_whitespace.py" >> .hg/hgrc +cp $EXAMPLE_DIR/ch09/check_whitespace.py.lst .hg/check_whitespace.py +chmod +x .hg/check_whitespace.py + +#$ name: better + +cat .hg/hgrc +echo 'a ' >> a +hg commit -A -m 'add new line with trailing whitespace' +sed -i 's, *$,,' a +hg commit -A -m 'trimmed trailing whitespace' + +#$ name: +exit 0 diff -r 5276f40fca1c -r 896ab6eaf1c6 en/examples/ch10/multiline --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/en/examples/ch10/multiline Fri Jul 10 02:32:17 2009 +0900 @@ -0,0 +1,13 @@ +#!/bin/sh + +hg init +echo a > test.c +hg ci -Am'First commit' + +#$ name: go + +cat > multiline << EOF +changeset = "Changed in {node|short}:\n{files}" +file = " {file}\n" +EOF +hg log --style multiline diff -r 5276f40fca1c -r 896ab6eaf1c6 en/examples/ch11/qdelete --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/en/examples/ch11/qdelete Fri Jul 10 02:32:17 2009 +0900 @@ -0,0 +1,32 @@ +#!/bin/bash + +echo '[extensions]' >> $HGRC +echo 'hgext.mq =' >> $HGRC + +#$ name: go + +hg init myrepo +cd myrepo +hg qinit +hg qnew bad.patch +echo a > a +hg add a +hg qrefresh +hg qdelete bad.patch +hg qpop +hg qdelete bad.patch + +#$ name: convert + +hg qnew good.patch +echo a > a +hg add a +hg qrefresh -m 'Good change' +hg qfinish tip +hg qapplied +hg tip --style=compact + +#$ name: import + +hg qimport -r tip +hg qapplied diff -r 5276f40fca1c -r 896ab6eaf1c6 en/examples/daily.copy --- a/en/examples/daily.copy Thu Jul 09 13:32:44 2009 +0900 +++ b/en/examples/daily.copy Fri Jul 10 02:32:17 2009 +0900 @@ -51,9 +51,9 @@ cd copy-example echo a > a echo b > b -mkdir c -mkdir c/a -echo c > c/a/c +mkdir z +mkdir z/a +echo c > 
z/a/c hg ci -Ama #$ name: simple @@ -70,13 +70,13 @@ #$ name: dir-src -hg copy c e +hg copy z e #$ name: dir-src-dest -hg copy c d +hg copy z d #$ name: after -cp a z -hg copy --after a z +cp a n +hg copy --after a n diff -r 5276f40fca1c -r 896ab6eaf1c6 en/examples/data/check_whitespace.py --- a/en/examples/data/check_whitespace.py Thu Jul 09 13:32:44 2009 +0900 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,44 +0,0 @@ -#!/usr/bin/python - -import re - -def trailing_whitespace(difflines): - added, linenum, header = [], 0, False - - for line in difflines: - if header: - # remember the name of the file that this diff affects - m = re.match(r'(?:---|\+\+\+) ([^\t]+)', line) - if m and m.group(1) != '/dev/null': - filename = m.group(1).split('/', 1)[-1] - if line.startswith('+++ '): - header = False - continue - if line.startswith('diff '): - header = True - continue - # hunk header - save the line number - m = re.match(r'@@ -\d+,\d+ \+(\d+),', line) - if m: - linenum = int(m.group(1)) - continue - # hunk body - check for an added line with trailing whitespace - m = re.match(r'\+.*\s$', line) - if m: - added.append((filename, linenum)) - if line and line[0] in ' +': - linenum += 1 - return added - -if __name__ == '__main__': - import os, sys - - added = trailing_whitespace(os.popen('hg export tip')) - if added: - for filename, linenum in added: - print >> sys.stderr, ('%s, line %d: trailing whitespace added' % - (filename, linenum)) - # save the commit message so we don't need to retype it - os.system('hg tip --template "{desc}" > .hg/commit.save') - print >> sys.stderr, 'commit message saved to .hg/commit.save' - sys.exit(1) diff -r 5276f40fca1c -r 896ab6eaf1c6 en/examples/hook.ws --- a/en/examples/hook.ws Thu Jul 09 13:32:44 2009 +0900 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,31 +0,0 @@ -#!/bin/bash - -hg init a -cd a -echo '[hooks]' > .hg/hgrc -echo "pretxncommit.whitespace = hg export tip | (! 
egrep -q '^\\+.*[ \\t]$')" >> .hg/hgrc - -#$ name: simple - -cat .hg/hgrc -echo 'a ' > a -hg commit -A -m 'test with trailing whitespace' -echo 'a' > a -hg commit -A -m 'drop trailing whitespace and try again' - -#$ name: - -echo '[hooks]' > .hg/hgrc -echo "pretxncommit.whitespace = .hg/check_whitespace.py" >> .hg/hgrc -cp $EXAMPLE_DIR/data/check_whitespace.py .hg - -#$ name: better - -cat .hg/hgrc -echo 'a ' >> a -hg commit -A -m 'add new line with trailing whitespace' -sed -i 's, *$,,' a -hg commit -A -m 'trimmed trailing whitespace' - -#$ name: -exit 0 diff -r 5276f40fca1c -r 896ab6eaf1c6 en/examples/mq.guards --- a/en/examples/mq.guards Thu Jul 09 13:32:44 2009 +0900 +++ b/en/examples/mq.guards Fri Jul 10 02:32:17 2009 +0900 @@ -29,7 +29,7 @@ #$ name: qguard.neg -hg qguard hello.patch -quux +hg qguard -- hello.patch -quux hg qguard hello.patch #$ name: series diff -r 5276f40fca1c -r 896ab6eaf1c6 en/examples/tour --- a/en/examples/tour Thu Jul 09 13:32:44 2009 +0900 +++ b/en/examples/tour Fri Jul 10 02:32:17 2009 +0900 @@ -119,6 +119,7 @@ hg update 2 hg parents hg update +hg parents #$ name: clone-push @@ -148,24 +149,32 @@ #$ name: cp hello.c ../new-hello.c -sed -i '/printf/i\\tprintf("once more, hello.\\n");' ../new-hello.c +sed -i '/printf("hello,/i\\tprintf("once more, hello.\\n");' ../new-hello.c + +my-text-editor() +{ +cp ../new-hello.c hello.c +} #$ name: merge.clone cd .. hg clone hello my-new-hello cd my-new-hello -# The file new-hello.c is lightly edited. -cp ../new-hello.c hello.c +# Make some simple edits to hello.c. +my-text-editor hello.c hg commit -m 'A new hello for a new day.' 
#$ name: merge.dummy2 hg log -r 5 | grep changeset | cut -c 16-19 2>/dev/null > /tmp/REV5.my-new-hello -#$ name: merge.cat +#$ name: merge.cat1 cat hello.c + +#$ name: merge.cat2 + cat ../my-hello/hello.c #$ name: merge.pull diff -r 5276f40fca1c -r 896ab6eaf1c6 en/figs/bad-merge-1.dot --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/en/figs/bad-merge-1.dot Fri Jul 10 02:32:17 2009 +0900 @@ -0,0 +1,13 @@ +digraph bad_merge_1 { + ancestor [label="1: ancestor"]; + left [label="2: my change"]; + right [label="3: your change"]; + bad [label="4: bad merge"]; + new [label="5: new change"]; + + ancestor -> left; + ancestor -> right; + left -> bad; + right -> bad; + bad -> new; +} diff -r 5276f40fca1c -r 896ab6eaf1c6 en/figs/bad-merge-2.dot --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/en/figs/bad-merge-2.dot Fri Jul 10 02:32:17 2009 +0900 @@ -0,0 +1,18 @@ +digraph bad_merge_2 { + ancestor [label="1: ancestor",color=grey,fontcolor=grey]; + left [label="2: my change",color=grey,fontcolor=grey]; + right [label="3: your change",color=grey,fontcolor=grey]; + bad [label="4: bad merge",color=grey,fontcolor=grey]; + new [label="5: new change",color=grey,fontcolor=grey]; + + bak_left [label="6: backout 1 of\nbad merge",shape=box]; + + ancestor -> left [color=grey]; + ancestor -> right [color=grey]; + left -> bad [color=grey]; + right -> bad [color=grey]; + bad -> new [color=grey]; + + bad -> bak_left; + left -> bak_left [style=dotted,label="--parent=2"]; +} diff -r 5276f40fca1c -r 896ab6eaf1c6 en/figs/bad-merge-3.dot --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/en/figs/bad-merge-3.dot Fri Jul 10 02:32:17 2009 +0900 @@ -0,0 +1,22 @@ +digraph bad_merge_3 { + ancestor [label="1: ancestor",color="#bbbbbb",fontcolor="#bbbbbb"]; + left [label="2: my change",color="#bbbbbb",fontcolor="#bbbbbb"]; + right [label="3: your change",color="#bbbbbb",fontcolor="#bbbbbb"]; + bad [label="4: bad merge",color="#bbbbbb",fontcolor="#bbbbbb"]; + new [label="5: new 
change",color="#bbbbbb",fontcolor="#bbbbbb"]; + + bak_left [label="6: backout 1 of\nbad merge",color=grey,shape=box]; + bak_right [label="8: backout 2 of\nbad merge",shape=box]; + + ancestor -> left [color="#bbbbbb"]; + ancestor -> right [color="#bbbbbb"]; + left -> bad [color="#bbbbbb"]; + right -> bad [color="#bbbbbb"]; + bad -> new [color="#bbbbbb"]; + + bad -> bak_left [color=grey]; + left -> bak_left [style=dotted,label="--parent=2",color=grey,fontcolor=grey]; + + bad -> bak_right; + right -> bak_right [style=dotted,label="--parent=3"]; +} diff -r 5276f40fca1c -r 896ab6eaf1c6 en/figs/bad-merge-4.dot --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/en/figs/bad-merge-4.dot Fri Jul 10 02:32:17 2009 +0900 @@ -0,0 +1,26 @@ +digraph bad_merge_4 { + ancestor [label="1: ancestor",color="#bbbbbb",fontcolor="#bbbbbb"]; + left [label="2: my change",color="#bbbbbb",fontcolor="#bbbbbb"]; + right [label="3: your change",color="#bbbbbb",fontcolor="#bbbbbb"]; + bad [label="4: bad merge",color="#bbbbbb",fontcolor="#bbbbbb"]; + new [label="5: new change",color="#bbbbbb",fontcolor="#bbbbbb"]; + + bak_left [label="6: backout 1 of\nbad merge",color=grey,fontcolor=grey,shape=box]; + bak_right [label="7: backout 2 of\nbad merge",color=grey,fontcolor=grey,shape=box]; + good [label="8: merge\nof backouts",shape=box]; + + ancestor -> left [color="#bbbbbb"]; + ancestor -> right [color="#bbbbbb"]; + left -> bad [color="#bbbbbb"]; + right -> bad [color="#bbbbbb"]; + bad -> new [color="#bbbbbb"]; + + bad -> bak_left [color=grey]; + left -> bak_left [style=dotted,label="--parent=2",color=grey,fontcolor=grey]; + + bad -> bak_right [color=grey]; + right -> bak_right [style=dotted,label="--parent=3",color=grey,fontcolor=grey]; + + bak_left -> good; + bak_right -> good; +} diff -r 5276f40fca1c -r 896ab6eaf1c6 en/figs/bad-merge-5.dot --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/en/figs/bad-merge-5.dot Fri Jul 10 02:32:17 2009 +0900 @@ -0,0 +1,30 @@ +digraph bad_merge_5 { + ancestor 
[label="1: ancestor",color="#bbbbbb",fontcolor="#bbbbbb"]; + left [label="2: my change",color="#bbbbbb",fontcolor="#bbbbbb"]; + right [label="3: your change",color="#bbbbbb",fontcolor="#bbbbbb"]; + bad [label="4: bad merge",color="#bbbbbb",fontcolor="#bbbbbb"]; + new [label="5: new change",color=grey,fontcolor=grey]; + + bak_left [label="6: backout 1 of\nbad merge",color="#bbbbbb",fontcolor="#bbbbbb",shape=box]; + bak_right [label="7: backout 2 of\nbad merge",color="#bbbbbb",fontcolor="#bbbbbb",shape=box]; + good [label="8: merge\nof backouts",color=grey,fontcolor=grey,shape=box]; + last [label="9: merge with\nnew change",shape=box]; + + ancestor -> left [color="#bbbbbb"]; + ancestor -> right [color="#bbbbbb"]; + left -> bad [color="#bbbbbb"]; + right -> bad [color="#bbbbbb"]; + bad -> new [color="#bbbbbb"]; + + bad -> bak_left [color="#bbbbbb"]; + left -> bak_left [style=dotted,label="--parent=2",color="#bbbbbb",fontcolor="#bbbbbb"]; + + bad -> bak_right [color="#bbbbbb"]; + right -> bak_right [style=dotted,label="--parent=3",color="#bbbbbb",fontcolor="#bbbbbb"]; + + bak_left -> good [color=grey]; + bak_right -> good [color=grey]; + + good -> last; + new -> last; +} diff -r 5276f40fca1c -r 896ab6eaf1c6 ja/Makefile --- a/ja/Makefile Thu Jul 09 13:32:44 2009 +0900 +++ b/ja/Makefile Fri Jul 10 02:32:17 2009 +0900 @@ -35,7 +35,6 @@ filenames \ hook.msglen \ hook.simple \ - hook.ws \ issue29 \ mq.guards \ mq.qinit-help \ diff -r 5276f40fca1c -r 896ab6eaf1c6 ja/Makefile.orig --- a/ja/Makefile.orig Thu Jul 09 13:32:44 2009 +0900 +++ b/ja/Makefile.orig Fri Jul 10 02:32:17 2009 +0900 @@ -73,7 +73,6 @@ filenames \ hook.msglen \ hook.simple \ - hook.ws \ issue29 \ mq.guards \ mq.qinit-help \ diff -r 5276f40fca1c -r 896ab6eaf1c6 ja/examples/auto-snippets.xml --- a/ja/examples/auto-snippets.xml Thu Jul 09 13:32:44 2009 +0900 +++ b/ja/examples/auto-snippets.xml Fri Jul 10 02:32:17 2009 +0900 @@ -1,4 +1,5 @@ + @@ -51,6 +52,25 @@ + + + + + + + + + + + + + + + + + + + @@ -60,6 
+80,13 @@ + + + + + + + @@ -70,6 +97,19 @@ + + + + + + + + + + + + + @@ -114,8 +154,6 @@ - - @@ -208,6 +246,8 @@ + + diff -r 5276f40fca1c -r 896ab6eaf1c6 ja/examples/backout --- a/ja/examples/backout Thu Jul 09 13:32:44 2009 +0900 +++ b/ja/examples/backout Fri Jul 10 02:32:17 2009 +0900 @@ -68,6 +68,10 @@ hg heads +#$ name: + +echo 'first change' > myfile + #$ name: manual.cat cat myfile diff -r 5276f40fca1c -r 896ab6eaf1c6 ja/examples/bisect --- a/ja/examples/bisect Thu Jul 09 13:32:44 2009 +0900 +++ b/ja/examples/bisect Fri Jul 10 02:32:17 2009 +0900 @@ -37,15 +37,15 @@ #$ name: search.init -hg bisect init +hg bisect --reset #$ name: search.bad-init -hg bisect bad +hg bisect --bad #$ name: search.good-init -hg bisect good 10 +hg bisect --good 10 #$ name: search.step1 @@ -70,7 +70,7 @@ fi echo this revision is $result - hg bisect $result + hg bisect --$result } #$ name: search.step2 @@ -85,7 +85,7 @@ #$ name: search.reset -hg bisect reset +hg bisect --reset #$ name: diff -r 5276f40fca1c -r 896ab6eaf1c6 ja/examples/daily.copy --- a/ja/examples/daily.copy Thu Jul 09 13:32:44 2009 +0900 +++ b/ja/examples/daily.copy Fri Jul 10 02:32:17 2009 +0900 @@ -51,9 +51,9 @@ cd copy-example echo a > a echo b > b -mkdir c -mkdir c/a -echo c > c/a/c +mkdir z +mkdir z/a +echo c > z/a/c hg ci -Ama #$ name: simple @@ -70,13 +70,13 @@ #$ name: dir-src -hg copy c e +hg copy z e #$ name: dir-src-dest -hg copy c d +hg copy z d #$ name: after -cp a z -hg copy --after a z +cp a n +hg copy --after a n diff -r 5276f40fca1c -r 896ab6eaf1c6 ja/examples/mq.guards --- a/ja/examples/mq.guards Thu Jul 09 13:32:44 2009 +0900 +++ b/ja/examples/mq.guards Fri Jul 10 02:32:17 2009 +0900 @@ -29,7 +29,7 @@ #$ name: qguard.neg -hg qguard hello.patch -quux +hg qguard -- hello.patch -quux hg qguard hello.patch #$ name: series diff -r 5276f40fca1c -r 896ab6eaf1c6 ja/examples/tour --- a/ja/examples/tour Thu Jul 09 13:32:44 2009 +0900 +++ b/ja/examples/tour Fri Jul 10 02:32:17 2009 +0900 @@ -119,6 +119,7 @@ hg 
update 2 hg parents hg update +hg parents #$ name: clone-push @@ -148,24 +149,32 @@ #$ name: cp hello.c ../new-hello.c -sed -i '/printf/i\\tprintf("once more, hello.\\n");' ../new-hello.c +sed -i '/printf("hello,/i\\tprintf("once more, hello.\\n");' ../new-hello.c + +my-text-editor() +{ +cp ../new-hello.c hello.c +} #$ name: merge.clone cd .. hg clone hello my-new-hello cd my-new-hello -# The file new-hello.c is lightly edited. -cp ../new-hello.c hello.c +# Make some simple edits to hello.c. +my-text-editor hello.c hg commit -m 'A new hello for a new day.' #$ name: merge.dummy2 hg log -r 5 | grep changeset | cut -c 16-19 2>/dev/null > /tmp/REV5.my-new-hello -#$ name: merge.cat +#$ name: merge.cat1 cat hello.c + +#$ name: merge.cat2 + cat ../my-hello/hello.c #$ name: merge.pull diff -r 5276f40fca1c -r 896ab6eaf1c6 web/genindex.py --- a/web/genindex.py Thu Jul 09 13:32:44 2009 +0900 +++ b/web/genindex.py Fri Jul 10 02:32:17 2009 +0900 @@ -6,7 +6,8 @@ filename_re = re.compile(r'<\?dbhtml filename="([^"]+)"\?>') title_re = re.compile(r'(.*)') -chapters = glob.glob('../en/ch*.xml') + glob.glob('../en/app*.xml') +chapters = (sorted(glob.glob('../en/ch*.xml')) + + sorted(glob.glob('../en/app*.xml'))) fp = open('index-read.html.in', 'w') diff -r 5276f40fca1c -r 896ab6eaf1c6 web/index.html.in --- a/web/index.html.in Thu Jul 09 13:32:44 2009 +0900 +++ b/web/index.html.in Fri Jul 10 02:32:17 2009 +0900 @@ -19,10 +19,14 @@

You can contribute!

-

I publish the source - code for this book as a Mercurial repository. Please feel +

I publish the source code for this book + as a + Mercurial repository. Please feel welcome to clone it, make modifications to your copy, and send me - changes.

+ changes. Getting a copy of the source takes just a few seconds if + you have Mercurial installed:

+ +
hg clone http://hg.serpentine.com/mercurial/book

The online version of the book includes a comment system that you can use to send feedback involving errors, omissions, and