XXCOPY TECHNICAL BULLETIN #05
From: Kan Yabumoto tech@xxcopy.com
To: XXCOPY user
Subject: The Exclusion specifier in XXCOPY
Date: 2000-12-01 (revised)
====================================================================
Much of the mostly hidden power of XXCOPY lies in the exclusion
mechanism. We consider the /X switch one of the most important
enhancements we made in XXCOPY. Because it is a complex scheme
with many implied rules, its full potential cannot be used
effectively without a detailed explanation of both the syntax and
the way the exclusion scheme is implemented. This article
discusses all the rules that apply to the exclusion feature.
XXCOPY Exclusion switch syntax
    /X<xspec>    excludes the file or directory item given by
                 <xspec> which is an exclusion specifier.
                 If the specifier contains an embedded space,
                 the specifier must be surrounded by a pair
                 of double-quotes (").
    /EX<xfile>   specifies a text file whose name is <xfile>
                 which contains a list of <xspec> separated by space.
    /ZX          ignores the environment variable, "XXCOPYX".
    XXCOPYX      The environment variable XXCOPYX specifies a
    (env var)    list of <xspec> which are separated by a space.
    XXCOPY       The environment variable XXCOPY specifies a
    (env var)    list of XXCOPY switches which may be /X<xspec>.
Note the difference between the two environment variables:
every item in the XXCOPY value must be a complete switch, that is,
a slash (/) followed by a switch name (any XXCOPY switch is
allowed), whereas the XXCOPYX value is strictly a list of
exclusion specifiers for the /X switch. Omitting the /X prefix
from each item saves space.
You may specify as many exclusion specifiers as you like.
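As a rough illustration of how the two variables relate, the
following Python sketch turns an XXCOPYX-style value into the
equivalent list of /X switches that could be placed in the XXCOPY
variable. The function name is hypothetical and it is not part of
XXCOPY; quoted specifiers (those with an embedded space) are
deliberately not handled.

```python
def xxcopyx_to_switches(xxcopyx_value):
    """Prefix every space-separated specifier with /X.

    Assumption: no specifier contains an embedded space, so a plain
    whitespace split is enough for this illustration.
    """
    return " ".join("/X" + spec for spec in xxcopyx_value.split())


if __name__ == "__main__":
    # Three specifiers written the short way (XXCOPYX style).
    print(xxcopyx_to_switches(r"*.tmp *.bak cache\*\*"))
```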
Some examples of the /X switches
/Xc:\mydir\myfile.txt   // specifies just a single file
/X*.tmp                 // all files that end with ".tmp"
/Xabc*                  // all files that start with "abc"
/Xmydir\                // the entire directory, "mydir" in the source
/Xmydir\*\*             // same as /Xmydir\ which is a shortcut
/Xmydir\*\*.tmp         // inside mydir, all files matching "*.tmp"
/Xmy*xyz\*\abc*.c       // inside directories matching "my*xyz",
                        // all files matching "abc*.c"
/X*\cache\              // multiple-level subdirectories
/X*\cache\*\*           // same as above, written without the
                        // trailing-backslash shortcut
/X*\cach?\*\*           // multiple-level subdir spec may have wildcards
Here you get a glimpse of the powerful syntax of the exclusion
specifier. The first example is the most straightforward. The
fourth example, which ends with a backslash, is a shorthand for the
common case of excluding a directory (the trailing backslash stands
for "\*\*", as the fifth example shows).
Therefore, all of the above examples except the first one contain
or imply at least one wildcard specifier. The last example has a
wildcard in every one of its components.
Don't worry about the complexity yet. The first example shows a case
that you can use immediately without any further reading.
If you have the energy, you may painstakingly list every file you
want to exclude by giving its full file specification. Since you
will soon run out of command line space, you will probably want to
put the list in a text file and name that file with the /EX switch.
E.g., /EXmyexcl.lst
and myexcl.lst contains the following specifiers:
:: this is a comment line
c:\win386.swp               :: comment may start like this
c:\autoexec.bat myfile.tmp  :: one line may have multiple items
"c:\program files"          :: use quotes (") for embedded space
mydir\myfile.txt            :: pathspec relative to the source dir
yourdir\                    :: entire yourdir\*\*
Syntax rule for the Exclusion List File.
An "Exclusion List File" specified in the /EX switch is a plain
text file which contains a list of exclusion specifiers.
You may list as many exclusion specifiers as you like on one line.
Exclusion specifiers are separated by one or more blank, tab,
and/or newline characters. An exclusion specifier cannot be
broken into two or more lines. When a space character is
embedded, the exclusion specifier must be surrounded by a
pair of double-quotes ("). A line may contain a comment field
which will be ignored by XXCOPY. A comment field starts with
two consecutive colons (::) and ends at the end of the line.
We suggest placing one exclusion specifier on each line, followed
by a comment.
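The rules above can be sketched as a small Python reader. This is
only an illustration of the stated rules (whitespace separation,
double-quotes for embedded spaces, "::" comments), not XXCOPY's own
parser. The function name is hypothetical, and the handling of "::"
inside a quoted specifier (it is kept as ordinary text here) is an
assumption.

```python
def parse_exclusion_list(text):
    """Return the exclusion specifiers found in the text of a list file."""
    specs = []
    for line in text.splitlines():
        token = []          # characters of the specifier being collected
        has_token = False   # True once a specifier has started
        in_quotes = False
        i = 0
        while i < len(line):
            ch = line[i]
            if ch == '"':
                in_quotes = not in_quotes   # quotes group, but are not kept
                has_token = True
            elif not in_quotes and line.startswith("::", i):
                break                       # the rest of the line is a comment
            elif not in_quotes and ch in " \t":
                if has_token:
                    specs.append("".join(token))
                token, has_token = [], False
            else:
                token.append(ch)
                has_token = True
            i += 1
        if has_token:
            specs.append("".join(token))
    return specs


SAMPLE = r"""
:: this is a comment line
c:\win386.swp               :: comment may start like this
c:\autoexec.bat myfile.tmp  :: one line may have multiple items
"c:\program files"          :: use quotes (") for embedded space
mydir\myfile.txt            :: pathspec relative to the source dir
yourdir\                    :: entire yourdir\*\*
"""

if __name__ == "__main__":
    for spec in parse_exclusion_list(SAMPLE):
        print(spec)
```

Run against the sample list shown earlier, the sketch yields six
specifiers, with "c:\program files" kept as a single item.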
Definition of the exclusion specifier.
Up to now, the exact meaning of the exclusion specifier has not
been defined. Now, we are going to analyze the syntax and its
meaning in exhaustive detail. (Note: the exclusion specifier was
revised on 2000-10-09 with the addition of the multiple-level
subdirectory exclusion feature).
The exclusion specifier has up to three parts.
[ dir_spec\ ] [ *\ ] [ template ]
Although any of the three parts can be omitted, you must not omit
both dir_spec and template at the same time. Note that the last
part (template) can be either a file-template or a
directory-template, both of which are explained in more detail below.
Directory specifier ( dir_spec )
The dir_spec part specifies the base directory of the exclusion.
It is always followed by a backslash (\) character.
The directory can be specified in an absolute path (starting with
the root directory), or a relative path (without a leading
backslash) which is treated as relative to the source directory
(not the "current" directory).
The dir_spec may contain a wildcard specification in its
last part. For example
/Xc:\mydir\level1\abc*\*\template
/Xc:\mydir\level1\a*bc*.?oc\*\template
In both of the examples here, the last part of the directory
specifier (after \level1\) has at least one asterisk in it. The
second example goes one step further by using multiple asterisks
and even a question mark, which is another wildcard that matches
a single character.
The middle part (*\)
It denotes that the exclusion specification will be applied
not only to the dir_spec directory, but also to all of the
subdirectories underneath. It is the equivalent of the familiar
/S switch which modifies the source specifier so that the
XXCOPY action includes all subdirectories.
Since we do not have the luxury of a separate /S switch on each
exclusion item, we invented this notation which figuratively
suggests that the path starts with dir_spec, ends with the
template, and anything in between is accepted.
The following two examples highlight the effect of the middle part.
/Xmydir\myfile.*      // myfile in mydir\ only
/Xmydir\*\myfile.*c   // myfile in mydir\ and in every directory under it
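The effect of the middle part can be sketched in Python for the
file-template case. This is a simplified illustration, not XXCOPY's
matching code. It assumes paths relative to the source directory,
case-insensitive names, and Python's fnmatch wildcard rules, which
are not identical to DOS wildcard rules. Shortcut forms are not
expanded, and the function name is hypothetical.

```python
from fnmatch import fnmatchcase


def file_is_excluded(xspec, rel_path):
    """Test a file path against a [dir_spec\\][*\\]filetemplate specifier."""
    *spec_dirs, template = xspec.lower().split("\\")
    *path_dirs, name = rel_path.lower().split("\\")

    if not fnmatchcase(name, template):
        return False

    # A lone "*" component right before the template is the middle part.
    multi_level = bool(spec_dirs) and spec_dirs[-1] == "*"
    if multi_level:
        spec_dirs = spec_dirs[:-1]
        if len(path_dirs) < len(spec_dirs):
            return False
        head = path_dirs[:len(spec_dirs)]   # deeper levels are all accepted
    else:
        if len(path_dirs) != len(spec_dirs):
            return False                    # one level only
        head = path_dirs

    return all(fnmatchcase(p, s) for p, s in zip(head, spec_dirs))


if __name__ == "__main__":
    for spec in (r"mydir\myfile.*", r"mydir\*\myfile.*"):
        for path in (r"mydir\myfile.txt", r"mydir\sub\deep\myfile.txt"):
            print(spec, path, file_is_excluded(spec, path))
```

Without the middle part only the file directly inside mydir\ is
matched; with it, the file is matched at any depth under mydir\.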
Template specifier ( template )
The last part of the exclusion specifier is the template which
may be either a file-template or a directory-template. So, the
exclusion specifier can be more precisely described by using the
following two notations:
[ dir_spec\ ] [ *\ ] [ filetemplate ]
[ dir_spec\ ] [ *\ ] [ dirtemplate ]
Here, the syntactic distinction of the two types is made by
the ending of the template string.
Common shortcut notations of the exclusion specifier.
File template
When a lone template is specified without a trailing backslash
(e.g., /Xmyfile.txt ), it is treated as a shortcut for a
multiple-level filename template which is equivalent to
( /X*\myfile.txt ). This is mostly for historic reasons
(also, the frequency of this type of usage justifies it).
If you need to specify a one-level filename template, you
should prefix it with the dot directory (denoting the base, or
source, directory) to distinguish it from the multiple-level
case ( /X.\myfile.txt ).
Examples:
/Xtemplate     // file which matches the template inside
               // the base (src) directory or any of its
               // subdirectories (Multi-Level).
/X*\template   // the template applies to all subdirectories
               // this is same as above (Multi-Level)
/X.\template   // the dot denotes relative to the base (src)
               // directory (1-Level)
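The lone-template shortcut, together with the trailing-backslash
shortcut seen in the earlier examples, can be written out as a
small Python sketch that rewrites a specifier into its long form.
The function name is hypothetical, and only these two documented
shortcuts are handled.

```python
def expand_shortcuts(xspec):
    """Rewrite the two documented shortcut forms into their long forms."""
    if xspec.endswith("\\"):
        # "mydir\" abbreviates "mydir\*\*" (the whole directory).
        return xspec + "*\\*"
    if "\\" not in xspec:
        # A lone template is treated as a multi-level template.
        return "*\\" + xspec
    return xspec


if __name__ == "__main__":
    for spec in ("mydir\\", "myfile.txt", r".\myfile.txt"):
        print(spec, "->", expand_shortcuts(spec))
```

Note that a specifier such as .\myfile.txt is left unchanged, which
is what keeps it a one-level template.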
Directory template
The directory template may have the following four variations
in the ending.
dirtemplate\      // full directory
dirtemplate\*\*   // same thing with alternate notation
dirtemplate\*     // files in the directory (one-level)
dirtemplate\?\*   // all subdirectories but not
                  // the first-level files
The first two notations are interchangeable and denote
the whole directory. The third and fourth cases are
partial directory notations (when the two are combined,
they match the whole directory).
Examples:
/Xdirtmpl\*\*      // excludes all matching directories in the
                   // base (src) directory and its contents
/Xdirtmpl\         // same as above (the trailing backslash
                   // denotes everything inside the directory)
/X.\dirtmpl\       // in the case of the directory template,
                   // it applies to one directory relative to
                   // the base (src) directory (1-Level)
/X*\dirtmpl\       // you may make a directory template apply
                   // to many instances (Multi-Level)
/Xc:\windows\*     // specifies all the files in the first
                   // level of the c:\Windows directory such
                   // as, EXPLORER.EXE, WIN.INI, COMMAND.COM
/Xc:\windows\?\*   // this does not include the first-level
                   // files but all subdirectories in it such
                   // as \WINDOWS\SYSTEM\ \WINDOWS\DESKTOP\ etc.
Since both dir_spec and dirtemplate may contain wildcards,
it could be as complex as...
/Xc:\mydir\pat*ern\*\dir???\*\*
This one excludes all subdirectories whose names start with "dir"
followed by three characters and which appear at any level of
subdirectory under any directory inside c:\mydir whose
name matches "pat*ern".
Note that the following two are distinct:
/Xdir_spec\* // one layer only (subdirectories not excluded)
/Xdir_spec\*\* // the entire dir_spec directory is excluded
XXCOPY allows you to exclude either the entire subdirectory
(which affects both files and directories of any level), or
one directory layer (which affects only files in the immediate
level but not subdirectories).
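The distinction among the endings described in this section can be
summarized with a short Python sketch that looks only at how a
specifier ends. It is an illustration of the notation, not of
XXCOPY's internals; the function name and the returned labels are
hypothetical.

```python
def ending_kind(xspec):
    """Describe what the ending of an exclusion specifier covers."""
    if xspec.endswith("\\*\\*") or xspec.endswith("\\"):
        return "whole directory"           # files and subdirectories
    if xspec.endswith("\\?\\*"):
        return "subdirectories only"       # first-level files are kept
    if xspec.endswith("\\*"):
        return "first-level files only"    # subdirectories are kept
    return "file template"


if __name__ == "__main__":
    for spec in (r"c:\windows\*", r"c:\windows\?\*", r"c:\windows\*\*"):
        print(spec, "->", ending_kind(spec))
```

The order of the tests matters: the longer endings are checked
first because "\*\*" and "\?\*" also end with "\*".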
The variations in exclusion specifiers (11 cases)
The exclusion specifier may be classified into the following
eleven classes (A - K).
simple cases        1-Level templates          Multi-Level templates
-------------------------------------------------------------------------
                    D dir_spec\filetmpl        H dir_spec\*\filetmpl
A dir_spec\*        E dir_spec\dir_tmpl\*      I dir_spec\*\dir_tmpl\*
B dir_spec\?\*      F dir_spec\dir_tmpl\?\*    J dir_spec\*\dir_tmpl\?\*
C dir_spec\*\*      G dir_spec\dir_tmpl\*\*    K dir_spec\*\dir_tmpl\*\*
Note that a dir_spec may be specified with wildcard characters
in the last component level. For example,
c:\mydir\Level2\last?level\* // simple case
c:\mydir\Level2\last?level\template\ // 1-level case
c:\mydir\Level2\last?level\*\template\ // multi-level
Also, the file_template or directory_template may contain
wildcard characters.
c:\mydir\L2\last?level\file?template // simple filepattern
c:\mydir\L2\last?level\dir?template\ // whole directory
c:\mydir\L2\last?level\*\dir?template\* // 1-level files
c:\mydir\L2\last?level\*\dir?template\?\* // Multi-level case
Here, to illustrate the wildcard in the respective components,
a question mark (?) was added where a wildcard is permitted
(last?level\, file?template or dir?template).
Note that whereas the dir_spec shown above may consist of many
levels of directories, the template specifiers (dir_tmpl) in
Groups I, J, and K must be a single-level directory template
(without a backslash inside).
The optimization of exclusion matching.
In a very large scale backup operation, an XXCOPY job may encompass
an entire volume as the source directory (such as c:\*). To make
matters worse, the more files the source directory contains,
the greater the need for exclusion specifiers. It is entirely
possible that the C: drive contains 70,000 files and that the
exclusion list file given with the /EX switch contains literally
hundreds of exclusion specifiers. If we were to test every
file against every item in such a large exclusion list, the number
of combinations would easily reach tens of millions, which would
slow down the entire backup process.

Therefore, XXCOPY performs preprocessing steps to analyze the set
of exclusion specifiers. First, by classifying the specifiers, some
redundant ones can be removed. For example, if a dir_spec is
specified in Class B, any subdirectories of the same directory in
Classes C, D, E, or F, regardless of the template, will be
automatically excluded because the Class B spec for that directory
overshadows any subset of the directory. Moreover, in the actual
XXCOPY implementation, the set of active file pattern matching
templates is computed for each subdirectory, which eliminates a
significant number of redundant filename comparisons.
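The idea of removing overshadowed specifiers can be sketched in
Python. This is a deliberately simplified illustration and not
XXCOPY's actual preprocessing: it considers only wildcard-free
directories that are excluded as a whole (ending in \*\*) and drops
every other specifier that lies inside one of them. The function
name is hypothetical.

```python
def prune_overshadowed(specs):
    """Drop specifiers that lie inside a directory excluded as a whole."""
    whole_dirs = []
    for s in specs:
        if s.endswith("\\*\\*"):
            prefix = s[:-3].lower()     # keep the backslash: "dir\"
            if "*" not in prefix and "?" not in prefix:
                whole_dirs.append(prefix)

    kept = []
    for s in specs:
        low = s.lower()
        # A specifier is redundant if it starts inside a whole-directory
        # exclusion and is not that whole-directory specifier itself.
        inside = any(low.startswith(d) and low != d + "*\\*"
                     for d in whole_dirs)
        if not inside:
            kept.append(s)
    return kept


if __name__ == "__main__":
    specs = [r"c:\temp\*\*", r"c:\temp\cache\*.tmp", r"c:\data\*.bak"]
    print(prune_overshadowed(specs))
```

With fewer specifiers left, fewer filename comparisons are needed
per file, which is the point of the preprocessing described above.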
Debug feature
Because the exclusion parameters become complex when the number of
exclusion specifiers grows large, you may want to examine the list
of exclusion parameters immediately after the initial exclusion
parameter optimization steps are completed. The following switches
are available for that purpose:
/DEBUG    // displays the parameters and prompts for continuation
/DEBUGX   // displays the parameters and exits XXCOPY.
/OX       // outputs the exclusion parameters in the log file
/OP       // outputs the regular parameters in the log file.
/OX/W     // a convenient switch to test the exclusion settings
Automatically excluded files.
The few output files which are generated by the XXCOPY program
itself (e.g., the error log files) cannot be copied successfully
as part of the current job. If any of them happens to be in the
source directory (or its subdirectories), it is always excluded
implicitly.
© Copyright 2002 Pixelab, Inc. All rights reserved.
