XXCOPY TECHNICAL BULLETIN #28
From: Kan Yabumoto tech@xxcopy.com
To: XXCOPY user
Subject: The Wild-Wild-Source: the source spec with wildcards
Date: 2001-01-28
====================================================================
XXCOPY Command Parameter Syntax:
XXCOPY source [ destination ] [ switches... ]
We have shown XXCOPY's basic command line syntax on numerous
occasions. This article focuses on the first item, the source
specifier. (Switch arguments can be placed anywhere on the command
line, including to the left of the source.)
Source Specifier (XCOPY-compatible standard):
In another article, XXTB #25, the standard source
specifier that is compatible with Microsoft's XCOPY is discussed.
The standard source specifier is made of the following three parts.
[ volume_spec ] [ directory ] [ file_pattern ]
The other article discussed the case where the directory specifier
contains no wildcard characters, because Microsoft's XCOPY treats
them literally (the * and ? have no special power as wildcards
there). XXCOPY, on the other hand, handles wildcard characters in
the directory part of the source specifier, which is the subject of
this article.
The Wild-Wild-Source Specifier (XXCOPY-extended feature):
This is one of the features that most distinguishes XXCOPY from
other file management utilities. The source directory specifier
can be further separated into two sub-parts (compare with the
standard, three-part source specifier).
[ volume_spec ] [ base_dir ] [ directory_pattern ] [ file_pattern ]
The [ directory ] component in the standard specifier is now broken
up into [ base_dir ] and [ directory_pattern ]. The "constant" part
of the directory specifier, which has no wildcard, is classified
as the base_dir. The remaining part, which includes a wildcard, is
classified as the directory_pattern. Any of the four parts can
be omitted, but of course at least one must be present as the
source specifier.
For example
XCOPY C:\Windows\sys*\*.dll D:\dst\ /S
According to the standard three-part scheme, it breaks up like
volume_spec: C:
directory: \Windows\sys*\
file_pattern: *.dll
Of course, with Microsoft's XCOPY, you get nothing from this command.
XCOPY looks for a directory named C:\Windows\sys*\ which, interpreted
literally (XCOPY does just that), does not exist, so it finds no
matching files (*.dll).
With XXCOPY's wild-wild-source (four-part scheme) feature, the same
source specifier breaks up as follows.
XXCOPY C:\Windows\sys*\*.dll D:\dst\ /S
volume_spec: C:
base_dir: \Windows\
directory_pattern: sys*\
file_pattern: *.dll
This one command line combines the action previously done with
multiple lines like
XXCOPY C:\windows\system\*.dll d:\dst\system\ /S
XXCOPY C:\windows\system32\*.dll d:\dst\system32\ /S
...
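The four-part breakdown above can be sketched in a few lines of
Python. This is only an illustration of the rule as described in
this article, not XXCOPY's actual parser. The function name
split_source is made up for this example, and the sketch assumes
that the last name in the specifier is always the file pattern.

```python
def split_source(spec):
    """Split a source specifier into the four parts named in the article."""
    volume, rest = "", spec
    # A drive letter followed by a colon is the volume_spec.
    if len(rest) >= 2 and rest[1] == ":":
        volume, rest = rest[:2], rest[2:]
    # Assumption: everything after the last backslash is the file pattern.
    cut = rest.rfind("\\") + 1
    directory, file_pattern = rest[:cut], rest[cut:]
    # The base directory ends just before the first directory name
    # that contains a wildcard character.
    base_dir, remaining = "", directory
    while remaining:
        end = remaining.find("\\") + 1  # one name plus its backslash
        component = remaining[:end]
        if "*" in component or "?" in component:
            break
        base_dir += component
        remaining = remaining[end:]
    return volume, base_dir, remaining, file_pattern


# The example from the article: C:\Windows\sys*\*.dll
print(split_source("C:\\Windows\\sys*\\*.dll"))
```

Whatever is left over once the base directory has been peeled off
is the directory_pattern, so a specifier without wildcards in its
directory part simply comes back with an empty directory_pattern.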
The Multi-level Subdirectory Specifier:
In various examples, you may have seen a source specifier like
XXCOPY C:\Windows\*\?cache*\*\*.jpg \dst\
Yes, XXCOPY's unique Wild-Wild-Source feature allows you to use
wildcards liberally pretty much anywhere in the source specifier.
That includes the new \*\ notation where a single asterisk forms
a sole level of directory. You can go really wild with this
feature of having as many wildcards anywhere, any level, any
number... It makes XXCOPY a very wild beast indeed.
The \*\ sequence is a new notation which we came up with for XXCOPY
in order to encode multi-level directory name matching.
Actually, the same concept has been present in Microsoft's XCOPY
in the form of the /S switch which specifies that a filename
pattern be applied to multiple-level subdirectories. For example,
XCOPY C:\Windows\*.jpg \dst\ /S
XXCOPY C:\Windows\*.jpg \dst\ /S
The /S switch is a very basic switch and most XCOPY/XXCOPY users
are familiar with this concept. It covers not only the first-level
directory but also all of its subdirectories.
C:\Windows\mywife.jpg // first-level directory
C:\Windows\cache\mother1.jpg // second-level
C:\Windows\cache\deep\son.jpg // third-level
...
* * * * OK, Microsoft's XCOPY runs out of gas here. * * * *
The rest of the discussion applies only to the XXCOPY utility.
Using the new \*\ notation, the /S switch can be replaced by
XXCOPY C:\Windows\*\*.jpg \dst\
In this command line, the \*\ sequence immediately before the
filename template (*.jpg) makes the template apply to all
subdirectories below the path (C:\Windows\) as well as to that
directory itself, which is how the /S switch works.
Next, I will show you an even better example of the \*\ sequence,
one which cannot be specified with the traditional /S switch.
XXCOPY C:\Windows\*\cache\*.jpg \dst\
In this case, the subdirectory cache may appear at any level
of subdirectory (including the first level). This is somewhat
similar in spirit to the /S switch, but it does NOT allow the
last name part (*.jpg) to be matched at any directory level
other than the one immediately inside the cache\ directory.
Note the difference carefully: the \*\ sequence does not
appear between \cache\ and *.jpg.
Therefore, the following three cases are all different from one
another.
XXCOPY C:\Windows\*\*.jpg \dst\
XXCOPY C:\Windows\*\cache\*.jpg \dst\
XXCOPY C:\Windows\*\cache\*\*.jpg \dst\
The first line is equivalent to the familiar /S switch, where the
file pattern *.jpg applies to any level below C:\Windows\.
In the second line, \*\ provides multi-level matching for the
directory pattern \cache\ only (it just happens to contain no
wildcard character here, but one would be allowed). The filename
pattern *.jpg applies only to files immediately inside whichever
\cache\ directory is matched.
The third case is the most general of all: the \*\ sequence
appears both before the directory pattern, \cache\, and before
the filename pattern, *.jpg.
Here are some variations of the multi-level directory specifier:
\*\ // zero or more levels of subdirectory
\?*\ // exactly one level of subdirectory of any name
\*\?*\ // one or more levels of subdirectory
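The matching rules listed above can be modelled with a short
recursive function. This is a sketch under stated assumptions, not
XXCOPY's own algorithm: the name dirs_match is made up, directory
paths are given as lists of names relative to the base directory,
and Python's fnmatchcase stands in for DOS-style wildcard matching
(it also gives special meaning to square brackets, which this
article does not mention).

```python
from fnmatch import fnmatchcase


def dirs_match(pattern_parts, dir_parts):
    """Return True if the list of directory names fits the pattern list."""
    if not pattern_parts:
        return not dir_parts
    head, rest = pattern_parts[0], pattern_parts[1:]
    if head == "*":
        # A lone asterisk stands for zero or more directory levels,
        # so try every possible number of levels it could swallow.
        return any(dirs_match(rest, dir_parts[i:])
                   for i in range(len(dir_parts) + 1))
    # Any other name (such as ?* or cache) matches exactly one level.
    # Names are compared without regard to case, as on Windows.
    return (bool(dir_parts)
            and fnmatchcase(dir_parts[0].lower(), head.lower())
            and dirs_match(rest, dir_parts[1:]))


# \*\cache\ from the second of the three cases: cache at any level
print(dirs_match(["*", "cache"], ["a", "b", "cache"]))
# \*\?*\ requires at least one level, so the base directory itself fails
print(dirs_match(["*", "?*"], []))
```

Because ?* needs at least one character, it can never match "no
directory at all", which is what separates \*\?*\ (one or more
levels) from the plain \*\ (zero or more levels).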
XXCOPY sets no particular limit on this. You may use as many
wildcards as you want in the source specifier. Of course, there
is a practical limit on the total length of the source specifier
(260 characters in all for a full pathname in Windows).
Just for old-timer's finger habit:
For backward compatibility, mostly to accommodate old-timers' finger
habits, Microsoft allows *.* to denote any file (or directory) name,
even one that does not have a dot character in it. To honor the same
tradition (and to remain fully XCOPY-compatible), XXCOPY accepts *.*
as equivalent to the simpler (and preferred) single asterisk, *.
To be symmetrical, the multi-level subdirectory matching sequence
\*\ may be substituted by \*.*\. Similarly, \*\*\ (or even \*\*\*\*)
is a redundant (but permissible) expression which is treated as
equivalent to \*\.
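These two equivalences amount to a small normalization step, which
might look like the following. It is a sketch only: the function
name normalize_dir_pattern is made up, and the directory pattern is
assumed to be already split into a list of names.

```python
def normalize_dir_pattern(parts):
    """Rewrite *.* as * and collapse repeated lone asterisks into one."""
    result = []
    for part in parts:
        if part == "*.*":
            part = "*"  # the old-timer spelling of the single asterisk
        if part == "*" and result and result[-1] == "*":
            continue  # \*\*\ says nothing more than \*\
        result.append(part)
    return result


# \*.*\*\cache\*\*\ reduces to \*\cache\*\
print(normalize_dir_pattern(["*.*", "*", "cache", "*", "*"]))
```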
What is the "Base Directory":
We call the "constant" part of the source directory in an XXCOPY
operation the Base Directory. There is always exactly one Base
Directory in an XXCOPY command. In the traditional XCOPY-compatible
source directory specifier (without wildcards), the pathname up to
the last name (the file_pattern) is the Base Directory. With
wildcards in the source specifier, the Base Directory is the
leading part of the source specifier which does not contain any
wildcard character. This is why there is always only one Base
Directory.
The distinction between the Base Directory and the directory_pattern
matters for more than the name's sake. The Base Directory is the
directory level against which relative paths are resolved. It is
used both in forming the destination directory and as the
reference point for an exclusion (/X) directory.
For example, using a command line similar to one shown earlier:
XXCOPY C:\Windows\*\*cache*\*.jpg D:\dst\ /I
In the destination directory, you will find files like...
C:\Windows\abc\mycache\xrated.jpg --> D:\dst\abc\mycache\xrated.jpg
C:\Windows\a\b\cachex\xxx_pic.jpg --> D:\dst\a\b\cachex\xxx_pic.jpg
C:\Windows\cache\pta_oked.jpg --> D:\dst\cache\pta_oked.jpg
(The /I switch lets a new directory be created if it is missing.)
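The way a destination path is formed from the Base Directory can be
sketched as follows. This is an illustration of the mapping shown
in the listing, not XXCOPY code. The name destination_for is made
up, and both the base directory and the destination directory are
assumed to be written with a trailing backslash.

```python
def destination_for(source_file, base_dir, dest_dir):
    """Map a matched source file to its path under the destination."""
    # Windows paths are compared without regard to case.
    if not source_file.lower().startswith(base_dir.lower()):
        raise ValueError("file is not under the base directory")
    # The part of the path below the base directory is carried over as is.
    relative = source_file[len(base_dir):]
    return dest_dir + relative


print(destination_for("C:\\Windows\\abc\\mycache\\xrated.jpg",
                      "C:\\Windows\\", "D:\\dst\\"))
```

Only the part of the source path below the Base Directory survives
in the destination, which is why the directory names matched by the
wildcards reappear under D:\dst\.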
The Base Directory in this case is
C:\Windows\
which is the longest source directory path which does not contain
a wildcard. So, if you have a relative reference in an exclusion
switch, the path will be relative to the Base Directory.
For instance,
XXCOPY C:\Windows\*\*cache*\*.jpg D:\dst\ /Xcache*\
Here, the exclusion specifier (/Xcache*\) gives the pattern for
the directories to be excluded as "cache*\", which is relative to
the Base Directory; that is, C:\Windows\cache*\. The following
line, which spells out the full path, has the same effect:
XXCOPY C:\Windows\*\*cache*\*.jpg D:\dst\ /XC:\Windows\cache*\
In either form, the following file would be caught by the
exclusion specifier.
C:\Windows\cache\pta_oked.jpg
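How a relative exclusion is tied to the Base Directory can be
sketched like this. The code is an illustration under assumptions,
not XXCOPY's implementation: the names resolve_exclusion and
excludes are made up, an exclusion that begins with a drive letter
or a backslash is assumed to be already anchored, and fnmatchcase
stands in for DOS-style wildcard matching.

```python
from fnmatch import fnmatchcase


def resolve_exclusion(exclusion, base_path):
    """Turn a relative exclusion into one anchored at the base directory."""
    # Assumption: a leading backslash or a drive letter means "not relative".
    anchored = exclusion.startswith("\\") or exclusion[1:2] == ":"
    return exclusion if anchored else base_path + exclusion


def excludes(dir_exclusion, file_path):
    """True if file_path lies inside a directory matched by dir_exclusion.

    dir_exclusion is an anchored directory pattern ending in a backslash,
    with wildcards (if any) only in its last name.
    """
    parent, _, last = dir_exclusion.rstrip("\\").rpartition("\\")
    prefix = parent.lower() + "\\"
    path = file_path.lower()
    if not path.startswith(prefix):
        return False
    # The name right below the parent must match and must be a directory,
    # i.e. something has to follow it in the path.
    name, sep, _ = path[len(prefix):].partition("\\")
    return bool(sep) and fnmatchcase(name, last.lower())


rule = resolve_exclusion("cache*\\", "C:\\Windows\\")
print(rule)
print(excludes(rule, "C:\\Windows\\cache\\pta_oked.jpg"))
```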
Does the Wild-Wild-Source scheme apply to the exclude switch?
Unfortunately, the answer is NO. The exclusion specifier is
not implemented as flexibly as the source directory specifier,
mostly for the sake of reasonable performance.
If exclusion specifiers were given total freedom in the placement
of wildcard characters, just like the source specifier, then
unless we came up with a very clever algorithm, the combinatorial
explosion would be so severe that the operation would be
intolerably slow and not useful. That is our official excuse, at
least. On the other hand, the current set of exclusion features
was chosen in such a way that overall XXCOPY performance is not
severely compromised even by a very large number of exclusion
specifiers. Currently, the use of wildcards in an excluded item
is limited to the last name (either file or directory) portion
of the specifier.
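The restriction on where wildcards may appear in an excluded item
can be expressed as a simple check. This is a sketch of the rule as
stated here, not of how XXCOPY validates its switches, and the
function name wildcard_only_in_last_name is made up.

```python
def wildcard_only_in_last_name(exclusion):
    """True if * and ? occur nowhere except in the last name of the item."""
    # Split on backslashes and drop empty pieces, so a trailing backslash
    # (which marks a directory) does not count as a name.
    names = [name for name in exclusion.split("\\") if name]
    return not any("*" in name or "?" in name for name in names[:-1])


print(wildcard_only_in_last_name("C:\\Windows\\cache*\\"))
print(wildcard_only_in_last_name("C:\\Win*\\cache\\"))
```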
© Copyright 2002 Pixelab, Inc. All rights reserved.
