comparison ja/filenames.tex @ 290:b0db5adf11c1 ja_root

fork Japanese translation.
author Yoshiki Yazawa <yaz@cc.rim.or.jp>
date Wed, 06 Feb 2008 17:43:11 +0900
parents en/filenames.tex@f8a2fe77908d
children 3b1291f24c0d
289:7be02466421b 290:b0db5adf11c1
1 \chapter{File names and pattern matching}
2 \label{chap:names}
3
4 Mercurial provides mechanisms that let you work with file names in a
5 consistent and expressive way.
6
7 \section{Simple file naming}
8
9 Mercurial uses a unified piece of machinery ``under the hood'' to
10 handle file names. Every command behaves uniformly with respect to
11 file names. The way in which commands work with file names is as
12 follows.
13
14 If you explicitly name real files on the command line, Mercurial works
15 with exactly those files, as you would expect.
16 \interaction{filenames.files}
17
18 When you provide a directory name, Mercurial will interpret this as
19 ``operate on every file in this directory and its subdirectories''.
20 Mercurial traverses the files and subdirectories in a directory in
21 alphabetical order. When it encounters a subdirectory, it will
22 traverse that subdirectory before continuing with the current
23 directory.
24 \interaction{filenames.dirs}
25
26 \section{Running commands without any file names}
27
28 Mercurial's commands that work with file names have useful default
29 behaviours when you invoke them without providing any file names or
30 patterns. What kind of behaviour you should expect depends on what
31 the command does. Here are a few rules of thumb you can use to
32 predict what a command is likely to do if you don't give it any names
33 to work with.
34 \begin{itemize}
35 \item Most commands will operate on the entire working directory.
36 This is what the \hgcmd{add} command does, for example.
37 \item If the command has effects that are difficult or impossible to
38 reverse, it will force you to explicitly provide at least one name
39 or pattern (see below). This protects you from accidentally
40 deleting files by running \hgcmd{remove} with no arguments, for
41 example.
42 \end{itemize}
43
44 It's easy to work around these default behaviours if they don't suit
45 you. If a command normally operates on the whole working directory,
46 you can invoke it on just the current directory and its subdirectories
47 by giving it the name ``\dirname{.}''.
48 \interaction{filenames.wdir-subdir}
49
50 Along the same lines, some commands normally print file names relative
51 to the root of the repository, even if you're invoking them from a
52 subdirectory. Such a command will print file names relative to your
53 subdirectory if you give it explicit names. Here, we're going to run
54 \hgcmd{status} from a subdirectory, and get it to operate on the
55 entire working directory while printing file names relative to our
56 subdirectory, by passing it the output of the \hgcmd{root} command.
57 \interaction{filenames.wdir-relname}
58
59 \section{Telling you what's going on}
60
61 The \hgcmd{add} example in the preceding section illustrates something
62 else that's helpful about Mercurial commands. If a command operates
63 on a file that you didn't name explicitly on the command line, it will
64 usually print the name of the file, so that you will not be surprised
65 by what's going on.
66
67 The principle here is of \emph{least surprise}. If you've exactly
68 named a file on the command line, there's no point in repeating it
69 back at you. If Mercurial is acting on a file \emph{implicitly},
70 because you provided no names, or a directory, or a pattern (see
71 below), it's safest to tell you what it's doing.
72
73 For commands that behave this way, you can silence them using the
74 \hggopt{-q} option. You can also get them to print the name of every
75 file, even those you've named explicitly, using the \hggopt{-v}
76 option.
77
78 \section{Using patterns to identify files}
79
80 In addition to working with file and directory names, Mercurial lets
81 you use \emph{patterns} to identify files. Mercurial's pattern
82 handling is expressive.
83
84 On Unix-like systems (Linux, MacOS, etc.), the job of matching file
85 names to patterns normally falls to the shell. On these systems, you
86 must explicitly tell Mercurial that a name is a pattern. On Windows,
87 the shell does not expand patterns, so Mercurial will automatically
88 identify names that are patterns, and expand them for you.
89
90 To provide a pattern in place of a regular name on the command line,
91 the mechanism is simple:
92 \begin{codesample2}
93 syntax:patternbody
94 \end{codesample2}
95 That is, a pattern is identified by a short text string that says what
96 kind of pattern this is, followed by a colon, followed by the actual
97 pattern.
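
As a rough illustration of the ``kind, colon, body'' rule, here is a minimal Python sketch. The helper name \texttt{split\_pattern} is hypothetical and is not part of Mercurial; it only assumes the two pattern kinds described in this chapter, and treats anything else as a plain file name.

```python
# Hypothetical helper, not Mercurial's own code: split "kind:body"
# into its two parts, recognising only the kinds named in this chapter.
KNOWN_KINDS = ("glob", "re")


def split_pattern(name):
    """Return (kind, body); kind is None for a plain file name."""
    kind, sep, body = name.partition(":")
    if sep and kind in KNOWN_KINDS:
        return kind, body
    # No recognised prefix: treat the whole string as an ordinary name.
    return None, name


if __name__ == "__main__":
    for arg in ("glob:*.py", "re:.*\\.c$", "src/main.c"):
        print(arg, "->", split_pattern(arg))
```

Only the text before the first colon is inspected, so a colon inside the pattern body is left alone.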
98
99 Mercurial supports two kinds of pattern syntax. The most frequently
100 used is called \texttt{glob}; this is the same kind of pattern
101 matching used by the Unix shell, and should be familiar to Windows
102 command prompt users, too.
103
104 When Mercurial does automatic pattern matching on Windows, it uses
105 \texttt{glob} syntax. You can thus omit the ``\texttt{glob:}'' prefix
106 on Windows, but it's safe to use it, too.
107
108 The \texttt{re} syntax is more powerful; it lets you specify patterns
109 using regular expressions, also known as regexps.
110
111 By the way, in the examples that follow, notice that I'm careful to
112 wrap all of my patterns in quote characters, so that they won't get
113 expanded by the shell before Mercurial sees them.
114
115 \subsection{Shell-style \texttt{glob} patterns}
116
117 This is an overview of the kinds of patterns you can use when you're
118 matching on glob patterns.
119
120 The ``\texttt{*}'' character matches any string, within a single
121 directory.
122 \interaction{filenames.glob.star}
123
124 The ``\texttt{**}'' pattern matches any string, and crosses directory
125 boundaries. It's not a standard Unix glob token, but it's accepted by
126 several popular Unix shells, and is very useful.
127 \interaction{filenames.glob.starstar}
128
129 The ``\texttt{?}'' pattern matches any single character.
130 \interaction{filenames.glob.question}
131
132 The ``\texttt{[}'' character begins a \emph{character class}. This
133 matches any single character within the class. The class ends with a
134 ``\texttt{]}'' character. A class may contain multiple \emph{range}s
135 of the form ``\texttt{a-f}'', which is shorthand for
136 ``\texttt{abcdef}''.
137 \interaction{filenames.glob.range}
138 If the first character after the ``\texttt{[}'' in a character class
139 is a ``\texttt{!}'', it \emph{negates} the class, making it match any
140 single character not in the class.
141
142 A ``\texttt{\{}'' begins a group of subpatterns, where the whole group
143 matches if any subpattern in the group matches. The ``\texttt{,}''
144 character separates subpatterns, and ``\texttt{\}}'' ends the group.
145 \interaction{filenames.glob.group}
146
147 \subsubsection{Watch out!}
148
149 Don't forget that if you want to match a pattern in any directory, you
150 should not be using the ``\texttt{*}'' match-any token, as this will
151 only match within one directory. Instead, use the ``\texttt{**}''
152 token. This small example illustrates the difference between the two.
153 \interaction{filenames.glob.star-starstar}
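
To make the difference concrete, here is a small Python sketch that translates a glob into a regular expression. It is an illustration under simplifying assumptions, not Mercurial's actual translation: the function name \texttt{glob\_to\_regex} is hypothetical, and it handles only the ``\texttt{*}'', ``\texttt{**}'' and ``\texttt{?}'' tokens, ignoring character classes and groups.

```python
import re


def glob_to_regex(pattern):
    """Translate a simplified glob into a compiled regexp (sketch only)."""
    out = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**", i):
            out.append(".*")        # "**" may cross directory boundaries
            i += 2
        elif pattern[i] == "*":
            out.append("[^/]*")     # "*" stays within a single directory
            i += 1
        elif pattern[i] == "?":
            out.append(".")         # "?" matches any single character
            i += 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(out))


if __name__ == "__main__":
    for glob in ("*.py", "**.py"):
        matched = glob_to_regex(glob).fullmatch("src/a.py") is not None
        print(glob, "matches src/a.py:", matched)
```

In this sketch, ``\texttt{*.py}'' does not match \filename{src/a.py} because ``\texttt{*}'' cannot consume the ``\texttt{/}'', while ``\texttt{**.py}'' does.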
154
155 \subsection{Regular expression matching with \texttt{re} patterns}
156
157 Mercurial accepts the same regular expression syntax as the Python
158 programming language (it uses Python's regexp engine internally).
159 This is based on the Perl language's regexp syntax, which is the most
160 popular dialect in use (it's also used in Java, for example).
161
162 I won't discuss Mercurial's regexp dialect in any detail here, as
163 regexps are not often used. Perl-style regexps are in any case
164 already exhaustively documented on a multitude of web sites, and in
165 many books. Instead, I will focus here on a few things you should
166 know if you find yourself needing to use regexps with Mercurial.
167
168 A regexp is matched against an entire file name, relative to the root
169 of the repository. In other words, even if you're already in
170 subdirectory \dirname{foo}, if you want to match files under this
171 directory, your pattern must start with ``\texttt{foo/}''.
172
173 One thing to note, if you're familiar with Perl-style regexps, is that
174 Mercurial's are \emph{rooted}. That is, a regexp starts matching
175 against the beginning of a string; it doesn't look for a match
176 anywhere within the string. To match anywhere in a string, start
177 your pattern with ``\texttt{.*}''.
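
Since Mercurial uses Python's regexp engine, rooted matching can be illustrated with Python's own \texttt{re.match}, which only matches at the start of a string, as opposed to \texttt{re.search}, which looks anywhere. The file name below is made up for the example, and the snippet is a sketch of the behaviour described above, not a copy of Mercurial's code.

```python
import re

# A made-up file name, relative to the repository root.
name = "foo/lib/util.c"

# Rooted: the pattern must match from the very start of the name.
rooted = re.match(r"util\.c", name)

# Starting the pattern with ".*" lets it match anywhere in the name.
anywhere = re.match(r".*util\.c", name)

# Matching files under foo/ requires the directory in the pattern.
under_foo = re.match(r"foo/.*\.c", name)

# For contrast, an unrooted search finds the text wherever it occurs.
unrooted = re.search(r"util\.c", name)

if __name__ == "__main__":
    print("rooted:", rooted is not None)
    print("anywhere:", anywhere is not None)
    print("under_foo:", under_foo is not None)
    print("unrooted:", unrooted is not None)
```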
178
179 \section{Filtering files}
180
181 Not only does Mercurial give you a variety of ways to specify files;
182 it lets you further winnow those files using \emph{filters}. Commands
183 that work with file names accept two filtering options.
184 \begin{itemize}
185 \item \hggopt{-I}, or \hggopt{--include}, lets you specify a pattern
186 that file names must match in order to be processed.
187 \item \hggopt{-X}, or \hggopt{--exclude}, gives you a way to
188 \emph{avoid} processing files, if they match this pattern.
189 \end{itemize}
190 You can provide multiple \hggopt{-I} and \hggopt{-X} options on the
191 command line, and intermix them as you please. Mercurial interprets
192 the patterns you provide using glob syntax by default (but you can use
193 regexps if you need to).
194
195 You can read a \hggopt{-I} filter as ``process only the files that
196 match this filter''.
197 \interaction{filenames.filter.include}
198 The \hggopt{-X} filter is best read as ``process only the files that
199 don't match this pattern''.
200 \interaction{filenames.filter.exclude}
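
The way the two filters combine can be sketched in a few lines of Python. This is an assumption-laden illustration, not Mercurial's implementation: the function \texttt{select} is hypothetical, it assumes a file is kept if it matches at least one include pattern (when any are given) and no exclude pattern, and it uses Python's \texttt{fnmatch} module as a stand-in for glob matching. Note that \texttt{fnmatch} lets ``\texttt{*}'' match across ``\texttt{/}'', unlike the glob rules described earlier, so the example sticks to names in a single directory.

```python
from fnmatch import fnmatchcase


def select(files, include=(), exclude=()):
    """Keep files that pass the include and exclude filters (sketch)."""
    chosen = []
    for name in files:
        # With include patterns present, the name must match one of them.
        if include and not any(fnmatchcase(name, p) for p in include):
            continue
        # Any matching exclude pattern removes the name.
        if any(fnmatchcase(name, p) for p in exclude):
            continue
        chosen.append(name)
    return chosen


if __name__ == "__main__":
    files = ["a.c", "a.h", "b.c", "README"]
    print(select(files, include=["*.c"]))
    print(select(files, exclude=["*.c"]))
    print(select(files, include=["a.*"], exclude=["*.h"]))
```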
201
202 \section{Ignoring unwanted files and directories}
203
204 XXX.
205
206 \section{Case sensitivity}
207 \label{sec:names:case}
208
209 If you're working in a mixed development environment that contains
210 both Linux (or other Unix) systems and Macs or Windows systems, you
211 should keep in the back of your mind the knowledge that they treat the
212 case (``N'' versus ``n'') of file names in incompatible ways. This is
213 not very likely to affect you, and it's easy to deal with if it does,
214 but it could surprise you if you don't know about it.
215
216 Operating systems and filesystems differ in the way they handle the
217 \emph{case} of characters in file and directory names. There are
218 three common ways to handle case in names.
219 \begin{itemize}
220 \item Completely case insensitive. Uppercase and lowercase versions
221 of a letter are treated as identical, both when creating a file and
222 during subsequent accesses. This is common on older DOS-based
223 systems.
224 \item Case preserving, but insensitive. When a file or directory is
225 created, the case of its name is stored, and can be retrieved and
226 displayed by the operating system. When an existing file is being
227 looked up, its case is ignored. This is the standard arrangement on
228 Windows and MacOS. The names \filename{foo} and \filename{FoO}
229 identify the same file. This treatment of uppercase and lowercase
230 letters as interchangeable is also referred to as \emph{case
231 folding}.
232 \item Case sensitive. The case of a name is significant at all times.
233 The names \filename{foo} and \filename{FoO} identify different files. This
234 is the way Linux and Unix systems normally work.
235 \end{itemize}
236
237 On Unix-like systems, it is possible to have any or all of the above
238 ways of handling case in action at once. For example, if you use a
239 USB thumb drive formatted with a FAT32 filesystem on a Linux system,
240 Linux will handle names on that filesystem in a case preserving, but
241 insensitive, way.
242
243 \subsection{Safe, portable repository storage}
244
245 Mercurial's repository storage mechanism is \emph{case safe}. It
246 translates file names so that they can be safely stored on both case
247 sensitive and case insensitive filesystems. This means that you can
248 use normal file copying tools to transfer a Mercurial repository onto,
249 for example, a USB thumb drive, and safely move that drive and
250 repository back and forth between a Mac, a PC running Windows, and a
251 Linux box.
252
253 \subsection{Detecting case conflicts}
254
255 When operating in the working directory, Mercurial honours the naming
256 policy of the filesystem where the working directory is located. If
257 the filesystem is case preserving, but insensitive, Mercurial will
258 treat names that differ only in case as the same.
259
260 An important aspect of this approach is that it is possible to commit
261 a changeset on a case sensitive (typically Linux or Unix) filesystem
262 that will cause trouble for users on case insensitive (usually Windows
263 and MacOS) filesystems. If a Linux user commits changes to two files, one
264 named \filename{myfile.c} and the other named \filename{MyFile.C},
265 they will be stored correctly in the repository. And in the working
266 directories of other Linux users, they will be correctly represented
267 as separate files.
268
269 If a Windows or Mac user pulls this change, they will not initially
270 have a problem, because Mercurial's repository storage mechanism is
271 case safe. However, once they try to \hgcmd{update} the working
272 directory to that changeset, or \hgcmd{merge} with that changeset,
273 Mercurial will spot the conflict between the two file names that the
274 filesystem would treat as the same, and forbid the update or merge
275 from occurring.
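
The kind of check involved can be sketched in Python as follows. This is not Mercurial's own detection code: the function \texttt{case\_collisions} is hypothetical, and it approximates case folding with \texttt{str.lower()}, which is an assumption that real filesystems need not share exactly.

```python
from collections import defaultdict


def case_collisions(names):
    """Group names that differ only in case (sketch, using str.lower())."""
    groups = defaultdict(list)
    for name in names:
        groups[name.lower()].append(name)
    # Only groups with more than one spelling are conflicts.
    return [sorted(group) for group in groups.values() if len(group) > 1]


if __name__ == "__main__":
    print(case_collisions(["myfile.c", "MyFile.C", "README"]))
```

A check along these lines can be run on a Linux or Unix system before committing, to spot names that would clash on a case insensitive filesystem.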
276
277 \subsection{Fixing a case conflict}
278
279 If you are using Windows or a Mac in a mixed environment where some of
280 your collaborators are using Linux or Unix, and Mercurial reports a
281 case folding conflict when you try to \hgcmd{update} or \hgcmd{merge},
282 the procedure to fix the problem is simple.
283
284 Just find a nearby Linux or Unix box, clone the problem repository
285 onto it, and use Mercurial's \hgcmd{rename} command to change the
286 names of any offending files or directories so that they will no
287 longer cause case folding conflicts. Commit this change, \hgcmd{pull}
288 or \hgcmd{push} it across to your Windows or MacOS system, and
289 \hgcmd{update} to the revision with the non-conflicting names.
290
291 The changeset with case-conflicting names will remain in your
292 project's history, and you still won't be able to \hgcmd{update} your
293 working directory to that changeset on a Windows or MacOS system, but
294 you can continue development unimpeded.
295
296 \begin{note}
297 Prior to version~0.9.3, Mercurial did not use a case safe repository
298 storage mechanism, and did not detect case folding conflicts. If
299 you are using an older version of Mercurial on Windows or MacOS, I
300 strongly recommend that you upgrade.
301 \end{note}
302
303 %%% Local Variables:
304 %%% mode: latex
305 %%% TeX-master: "00book"
306 %%% End: