comparison ja/filenames.tex @ 290:b0db5adf11c1 ja_root
fork Japanese translation.
author | Yoshiki Yazawa <yaz@cc.rim.or.jp> |
---|---|
date | Wed, 06 Feb 2008 17:43:11 +0900 |
parents | en/filenames.tex@f8a2fe77908d |
children | 3b1291f24c0d |
289:7be02466421b | 290:b0db5adf11c1 |
---|---|
1 \chapter{File names and pattern matching} | |
2 \label{chap:names} | |
3 | |
4 Mercurial provides mechanisms that let you work with file names in a | |
5 consistent and expressive way. | |
6 | |
7 \section{Simple file naming} | |
8 | |
9 Mercurial uses a unified piece of machinery ``under the hood'' to | |
10 handle file names. Every command behaves uniformly with respect to | |
11 file names. The way in which commands work with file names is as | |
12 follows. | |
13 | |
14 If you explicitly name real files on the command line, Mercurial works | |
15 with exactly those files, as you would expect. | |
16 \interaction{filenames.files} | |
17 | |
18 When you provide a directory name, Mercurial will interpret this as | |
19 ``operate on every file in this directory and its subdirectories''. | |
20 Mercurial traverses the files and subdirectories in a directory in | |
21 alphabetical order. When it encounters a subdirectory, it will | |
22 traverse that subdirectory before continuing with the current | |
23 directory. | |
24 \interaction{filenames.dirs} | |
25 | |
26 \section{Running commands without any file names} | |
27 | |
28 Mercurial's commands that work with file names have useful default | |
29 behaviours when you invoke them without providing any file names or | |
30 patterns. What kind of behaviour you should expect depends on what | |
31 the command does. Here are a few rules of thumb you can use to | |
32 predict what a command is likely to do if you don't give it any names | |
33 to work with. | |
34 \begin{itemize} | |
35 \item Most commands will operate on the entire working directory. | |
36 This is what the \hgcmd{add} command does, for example. | |
37 \item If the command has effects that are difficult or impossible to | |
38 reverse, it will force you to explicitly provide at least one name | |
39 or pattern (see below). This protects you from accidentally | |
40 deleting files by running \hgcmd{remove} with no arguments, for | |
41 example. | |
42 \end{itemize} | |
43 | |
44 It's easy to work around these default behaviours if they don't suit | |
45 you. If a command normally operates on the whole working directory, | |
46 you can invoke it on just the current directory and its subdirectories | |
47 by giving it the name ``\dirname{.}''. | |
48 \interaction{filenames.wdir-subdir} | |
49 | |
50 Along the same lines, some commands normally print file names relative | |
51 to the root of the repository, even if you're invoking them from a | |
52 subdirectory. Such a command will print file names relative to your | |
53 subdirectory if you give it explicit names. Here, we're going to run | |
54 \hgcmd{status} from a subdirectory, and get it to operate on the | |
55 entire working directory while printing file names relative to our | |
56 subdirectory, by passing it the output of the \hgcmd{root} command. | |
57 \interaction{filenames.wdir-relname} | |
58 | |
59 \section{Telling you what's going on} | |
60 | |
61 The \hgcmd{add} example in the preceding section illustrates something | |
62 else that's helpful about Mercurial commands. If a command operates | |
63 on a file that you didn't name explicitly on the command line, it will | |
64 usually print the name of the file, so that you will not be surprised | |
65 by what's going on. | |
66 | |
67 The principle here is of \emph{least surprise}. If you've exactly | |
68 named a file on the command line, there's no point in repeating it | |
69 back at you. If Mercurial is acting on a file \emph{implicitly}, | |
70 because you provided no names, or a directory, or a pattern (see | |
71 below), it's safest to tell you what it's doing. | |
72 | |
73 For commands that behave this way, you can silence them using the | |
74 \hggopt{-q} option. You can also get them to print the name of every | |
75 file, even those you've named explicitly, using the \hggopt{-v} | |
76 option. | |
77 | |
78 \section{Using patterns to identify files} | |
79 | |
80 In addition to working with file and directory names, Mercurial lets | |
81 you use \emph{patterns} to identify files. Mercurial's pattern | |
82 handling is expressive. | |
83 | |
84 On Unix-like systems (Linux, MacOS, etc.), the job of matching file | |
85 names to patterns normally falls to the shell. On these systems, you | |
86 must explicitly tell Mercurial that a name is a pattern. On Windows, | |
87 the shell does not expand patterns, so Mercurial will automatically | |
88 identify names that are patterns, and expand them for you. | |
89 | |
90 To provide a pattern in place of a regular name on the command line, | |
91 the mechanism is simple: | |
92 \begin{codesample2} | |
93 syntax:patternbody | |
94 \end{codesample2} | |
95 That is, a pattern is identified by a short text string that says what | |
96 kind of pattern this is, followed by a colon, followed by the actual | |
97 pattern. | |
98 | |
99 Mercurial supports two kinds of pattern syntax. The most frequently | |
100 used is called \texttt{glob}; this is the same kind of pattern | |
101 matching used by the Unix shell, and should be familiar to Windows | |
102 command prompt users, too. | |
103 | |
104 When Mercurial does automatic pattern matching on Windows, it uses | |
105 \texttt{glob} syntax. You can thus omit the ``\texttt{glob:}'' prefix | |
106 on Windows, but it's safe to use it, too. | |
107 | |
108 The \texttt{re} syntax is more powerful; it lets you specify patterns | |
109 using regular expressions, also known as regexps. | |
110 | |
111 By the way, in the examples that follow, notice that I'm careful to | |
112 wrap all of my patterns in quote characters, so that they won't get | |
113 expanded by the shell before Mercurial sees them. | |
114 | |
115 \subsection{Shell-style \texttt{glob} patterns} | |
116 | |
117 This is an overview of the kinds of patterns you can use when you're | |
118 matching on glob patterns. | |
119 | |
120 The ``\texttt{*}'' character matches any string, within a single | |
121 directory. | |
122 \interaction{filenames.glob.star} | |
123 | |
124 The ``\texttt{**}'' pattern matches any string, and crosses directory | |
125 boundaries. It's not a standard Unix glob token, but it's accepted by | |
126 several popular Unix shells, and is very useful. | |
127 \interaction{filenames.glob.starstar} | |
128 | |
129 The ``\texttt{?}'' pattern matches any single character. | |
130 \interaction{filenames.glob.question} | |
131 | |
132 The ``\texttt{[}'' character begins a \emph{character class}. This | |
133 matches any single character within the class. The class ends with a | |
134 ``\texttt{]}'' character. A class may contain multiple \emph{range}s | |
135 of the form ``\texttt{a-f}'', which is shorthand for | |
136 ``\texttt{abcdef}''. | |
137 \interaction{filenames.glob.range} | |
138 If the first character after the ``\texttt{[}'' in a character class | |
139 is a ``\texttt{!}'', it \emph{negates} the class, making it match any | |
140 single character not in the class. | |
141 | |
142 A ``\texttt{\{}'' begins a group of subpatterns, where the whole group | |
143 matches if any subpattern in the group matches. The ``\texttt{,}'' | |
144 character separates subpatterns, and ``\texttt{\}}'' ends the group. | |
145 \interaction{filenames.glob.group} | |
146 | |
147 \subsubsection{Watch out!} | |
148 | |
149 Don't forget that if you want to match a pattern in any directory, you | |
150 should not be using the ``\texttt{*}'' match-any token, as this will | |
151 only match within one directory. Instead, use the ``\texttt{**}'' | |
152 token. This small example illustrates the difference between the two. | |
153 \interaction{filenames.glob.star-starstar} | |
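The difference between the two tokens can also be sketched in a few lines of Python. This is only an illustration, not Mercurial's actual implementation: the helper names \texttt{glob\_to\_regex} and \texttt{matches} are hypothetical, and the translation handles only ``\texttt{*}'' and ``\texttt{**}'', ignoring the other glob tokens described above.

```python
import re

def glob_to_regex(pattern):
    """Translate a simplified glob (only * and **) into a compiled regex.

    Hypothetical helper for illustration; not Mercurial's own code.
    """
    out = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**", i):
            out.append(".*")       # ** may cross directory boundaries
            i += 2
        elif pattern[i] == "*":
            out.append("[^/]*")    # * stays within a single directory
            i += 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    # \Z forces the pattern to consume the whole name.
    return re.compile("".join(out) + r"\Z")

def matches(pattern, name):
    return glob_to_regex(pattern).match(name) is not None

names = ["main.py", "src/util.py", "src/lib/deep.py"]
star = [n for n in names if matches("*.py", n)]        # top level only
starstar = [n for n in names if matches("**.py", n)]   # any depth
```

Under these assumptions, \texttt{star} holds only the top-level file, while \texttt{starstar} holds all three names.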
154 | |
155 \subsection{Regular expression matching with \texttt{re} patterns} | |
156 | |
157 Mercurial accepts the same regular expression syntax as the Python | |
158 programming language (it uses Python's regexp engine internally). | |
159 This is based on the Perl language's regexp syntax, which is the most | |
160 popular dialect in use (it's also used in Java, for example). | |
161 | |
162 I won't discuss Mercurial's regexp dialect in any detail here, as | |
163 regexps are not often used. Perl-style regexps are in any case | |
164 already exhaustively documented on a multitude of web sites, and in | |
165 many books. Instead, I will focus here on a few things you should | |
166 know if you find yourself needing to use regexps with Mercurial. | |
167 | |
168 A regexp is matched against an entire file name, relative to the root | |
169 of the repository. In other words, even if you're already in | |
170 subdirectory \dirname{foo}, if you want to match files under this | |
171 directory, your pattern must start with ``\texttt{foo/}''. | |
172 | |
173 One thing to note, if you're familiar with Perl-style regexps, is that | |
174 Mercurial's are \emph{rooted}. That is, a regexp starts matching | |
175 against the beginning of a string; it doesn't look for a match | |
176 anywhere within the string. To match anywhere in a string, start | |
177 your pattern with ``\texttt{.*}''. | |
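Since Mercurial uses Python's regexp engine, the rooted behaviour can be tried out directly in Python. The sketch below assumes that Python's \texttt{re.match}, which only matches at the start of a string, is a fair stand-in for how Mercurial applies a \texttt{re} pattern; the file name \texttt{foo/bar.c} is made up for the example.

```python
import re

# A file name as a regexp would see it: relative to the repository root.
name = "foo/bar.c"

# re.match only succeeds if the pattern matches at the start of the string.
rooted_hit = re.match(r"foo/", name) is not None     # starts with foo/
rooted_miss = re.match(r"bar", name) is not None     # "bar" is not at the start
anywhere = re.match(r".*bar", name) is not None      # leading .* skips ahead
```

Here \texttt{rooted\_hit} and \texttt{anywhere} are true, and \texttt{rooted\_miss} is false.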
178 | |
179 \section{Filtering files} | |
180 | |
181 Not only does Mercurial give you a variety of ways to specify files; | |
182 it lets you further winnow those files using \emph{filters}. Commands | |
183 that work with file names accept two filtering options. | |
184 \begin{itemize} | |
185 \item \hggopt{-I}, or \hggopt{--include}, lets you specify a pattern | |
186 that file names must match in order to be processed. | |
187 \item \hggopt{-X}, or \hggopt{--exclude}, gives you a way to | |
188 \emph{avoid} processing files, if they match this pattern. | |
189 \end{itemize} | |
190 You can provide multiple \hggopt{-I} and \hggopt{-X} options on the | |
191 command line, and intermix them as you please. Mercurial interprets | |
192 the patterns you provide using glob syntax by default (but you can use | |
193 regexps if you need to). | |
194 | |
195 You can read a \hggopt{-I} filter as ``process only the files that | |
196 match this filter''. | |
197 \interaction{filenames.filter.include} | |
198 The \hggopt{-X} filter is best read as ``process only the files that | |
199 don't match this pattern''. | |
200 \interaction{filenames.filter.exclude} | |
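One way to picture how the two filters combine is the following Python sketch. It is a simplified model, not Mercurial's code: the function \texttt{select} is hypothetical, it assumes that a file passes if it matches at least one include pattern (or if no include patterns are given) and matches no exclude pattern, and it uses Python's \texttt{fnmatch} module, whose ``\texttt{*}'' also crosses directory boundaries, unlike Mercurial's.

```python
from fnmatch import fnmatchcase

def select(names, include=(), exclude=()):
    """Keep names that pass the include filters and avoid the exclude filters.

    Hypothetical model of -I/-X; the pattern semantics are fnmatch's.
    """
    chosen = []
    for name in names:
        # With include patterns present, a name must match at least one.
        if include and not any(fnmatchcase(name, p) for p in include):
            continue
        # A name matching any exclude pattern is always dropped.
        if any(fnmatchcase(name, p) for p in exclude):
            continue
        chosen.append(name)
    return chosen

names = ["a.c", "a.h", "b.c", "README"]
only_c = select(names, include=["*.c"])
no_headers = select(names, exclude=["*.h"])
both = select(names, include=["a.*"], exclude=["*.h"])
```

In this model an exclude pattern always wins over an include pattern, which is why \texttt{both} contains only \texttt{a.c}.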
201 | |
202 \section{Ignoring unwanted files and directories} | |
203 | |
204 XXX. | |
205 | |
206 \section{Case sensitivity} | |
207 \label{sec:names:case} | |
208 | |
209 If you're working in a mixed development environment that contains | |
210 both Linux (or other Unix) systems and Macs or Windows systems, you | |
211 should keep in the back of your mind the knowledge that they treat the | |
212 case (``N'' versus ``n'') of file names in incompatible ways. This is | |
213 not very likely to affect you, and it's easy to deal with if it does, | |
214 but it could surprise you if you don't know about it. | |
215 | |
216 Operating systems and filesystems differ in the way they handle the | |
217 \emph{case} of characters in file and directory names. There are | |
218 three common ways to handle case in names. | |
219 \begin{itemize} | |
220 \item Completely case insensitive. Uppercase and lowercase versions | |
221 of a letter are treated as identical, both when creating a file and | |
222 during subsequent accesses. This is common on older DOS-based | |
223 systems. | |
224 \item Case preserving, but insensitive. When a file or directory is | |
225 created, the case of its name is stored, and can be retrieved and | |
226 displayed by the operating system. When an existing file is being | |
227 looked up, its case is ignored. This is the standard arrangement on | |
228 Windows and MacOS. The names \filename{foo} and \filename{FoO} | |
229 identify the same file. This treatment of uppercase and lowercase | |
230 letters as interchangeable is also referred to as \emph{case | |
231 folding}. | |
232 \item Case sensitive. The case of a name is significant at all times. | |
233 The names \filename{foo} and \filename{FoO} identify different files. This | |
234 is the way Linux and Unix systems normally work. | |
235 \end{itemize} | |
236 | |
237 On Unix-like systems, it is possible to have any or all of the above | |
238 ways of handling case in action at once. For example, if you use a | |
239 USB thumb drive formatted with a FAT32 filesystem on a Linux system, | |
240 Linux will handle names on that filesystem in a case preserving, but | |
241 insensitive, way. | |
242 | |
243 \subsection{Safe, portable repository storage} | |
244 | |
245 Mercurial's repository storage mechanism is \emph{case safe}. It | |
246 translates file names so that they can be safely stored on both case | |
247 sensitive and case insensitive filesystems. This means that you can | |
248 use normal file copying tools to transfer a Mercurial repository onto, | |
249 for example, a USB thumb drive, and safely move that drive and | |
250 repository back and forth between a Mac, a PC running Windows, and a | |
251 Linux box. | |
252 | |
253 \subsection{Detecting case conflicts} | |
254 | |
255 When operating in the working directory, Mercurial honours the naming | |
256 policy of the filesystem where the working directory is located. If | |
257 the filesystem is case preserving, but insensitive, Mercurial will | |
258 treat names that differ only in case as the same. | |
259 | |
260 An important aspect of this approach is that it is possible to commit | |
261 a changeset on a case sensitive (typically Linux or Unix) filesystem | |
262 that will cause trouble for users on case insensitive (usually Windows | |
263 and MacOS) filesystems. If a Linux user commits changes to two files, one | |
264 named \filename{myfile.c} and the other named \filename{MyFile.C}, | |
265 they will be stored correctly in the repository. And in the working | |
266 directories of other Linux users, they will be correctly represented | |
267 as separate files. | |
268 | |
269 If a Windows or Mac user pulls this change, they will not initially | |
270 have a problem, because Mercurial's repository storage mechanism is | |
271 case safe. However, once they try to \hgcmd{update} the working | |
272 directory to that changeset, or \hgcmd{merge} with that changeset, | |
273 Mercurial will spot the conflict between the two file names that the | |
274 filesystem would treat as the same, and forbid the update or merge | |
275 from occurring. | |
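The kind of check involved can be sketched in Python as follows. This is an illustration only, not how Mercurial itself detects conflicts: the function \texttt{case\_conflicts} is hypothetical, and it approximates case folding with \texttt{str.lower}, which real filesystems do not necessarily follow exactly.

```python
from collections import defaultdict

def case_conflicts(names):
    """Group file names that would collide on a case insensitive filesystem.

    Hypothetical helper; str.lower is only an approximation of case folding.
    """
    groups = defaultdict(list)
    for name in names:
        groups[name.lower()].append(name)
    # Only groups with more than one spelling are conflicts.
    return [sorted(group) for group in groups.values() if len(group) > 1]

conflicts = case_conflicts(["myfile.c", "MyFile.C", "README", "src/main.c"])
```

With the two file names from the example above, \texttt{conflicts} contains a single group holding \texttt{MyFile.C} and \texttt{myfile.c}.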
276 | |
277 \subsection{Fixing a case conflict} | |
278 | |
279 If you are using Windows or a Mac in a mixed environment where some of | |
280 your collaborators are using Linux or Unix, and Mercurial reports a | |
281 case folding conflict when you try to \hgcmd{update} or \hgcmd{merge}, | |
282 the procedure to fix the problem is simple. | |
283 | |
284 Just find a nearby Linux or Unix box, clone the problem repository | |
285 onto it, and use Mercurial's \hgcmd{rename} command to change the | |
286 names of any offending files or directories so that they will no | |
287 longer cause case folding conflicts. Commit this change, \hgcmd{pull} | |
288 or \hgcmd{push} it across to your Windows or MacOS system, and | |
289 \hgcmd{update} to the revision with the non-conflicting names. | |
290 | |
291 The changeset with case-conflicting names will remain in your | |
292 project's history, and you still won't be able to \hgcmd{update} your | |
293 working directory to that changeset on a Windows or MacOS system, but | |
294 you can continue development unimpeded. | |
295 | |
296 \begin{note} | |
297 Prior to version~0.9.3, Mercurial did not use a case safe repository | |
298 storage mechanism, and did not detect case folding conflicts. If | |
299 you are using an older version of Mercurial on Windows or MacOS, I | |
300 strongly recommend that you upgrade. | |
301 \end{note} | |
302 | |
303 %%% Local Variables: | |
304 %%% mode: latex | |
305 %%% TeX-master: "00book" | |
306 %%% End: |