wcmatch.pathlib
from wcmatch import pathlib
New 5.0
wcmatch.pathlib
was added in wcmatch
5.0.
Overview
pathlib
is a library that contains subclasses of Python's pathlib
Path
and PurePath
classes, and their Posix and Windows subclasses, with the purpose of overriding the default glob
behavior with Wildcard Match's very own glob
. This allows a user of pathlib
to use all of the glob enhancements that Wildcard Match provides. This includes features such as extended glob patterns, brace expansions, and more.
This documentation does not mean to exhaustively describe the pathlib
library, just the differences introduced by Wildcard Match's implementation. Please check out Python's pathlib
documentation to learn more about pathlib
in general. Also, to learn more about the underlying glob library being used, check out the documentation for Wildcard Match's glob
.
Multi-Pattern Limits
Many of the API functions allow passing in multiple patterns or using either BRACE
or SPLIT
to expand a pattern into more patterns. The number of allowed patterns is limited to 1000
, but you can raise or lower this limit via the keyword option limit
. If you set limit
to 0
, there will be no limit.
New 6.0
The imposed pattern limit and corresponding limit
option were introduced in 6.0.
Differences
The API is the same as Python's default pathlib
except for the few differences related to file globbing and matching:
- Each pathlib object's glob, rglob, and match methods are now driven by the wcmatch.glob library. As a result, some of the defaults and accepted parameters are different. Also, many new optional features can be enabled via flags.
- glob, rglob, and match can take a single string pattern or a list of patterns. They also accept flags via the flags keyword. This matches the interfaces detailed in our glob documentation.
- glob, rglob, and match do not enable GLOBSTAR or DOTGLOB by default. These flags must be passed in to take advantage of this functionality.
- A globmatch function has been added to PurePath classes (and Path classes, which are derived from PurePath). It is like match except that it performs a "full" match. Python 3.13 added a similar function called full_match, which came long after our globmatch support was added. In recent versions we've also added full_match as an alias to our globmatch function. See match, globmatch, and full_match for more information.
- If file searching methods (glob and rglob) are given multiple patterns, they will ensure duplicate results are filtered out. This only occurs when more than one inclusive pattern is given, or a pattern is expanded into multiple inclusive patterns via BRACE or SPLIT. When this occurs, an internal set is kept to track the results returned so that duplicates can be filtered. This will not occur if only a single inclusive pattern is given or the NOUNIQUE flag is specified.
- Python's pathlib has logic to ignore . when used as a directory in both the file path and glob pattern. We do not alter how pathlib stores paths, but our implementation allows explicit use of . as a literal directory and will match accordingly. With that said, since pathlib normalizes paths by removing . directories, in most cases you won't notice the difference, except when the path is literally just a single . directory.
Python's default glob:
>>> import pathlib
>>> list(pathlib.Path('.').glob('docs/./src'))
[PosixPath('docs/src')]
Ours:
>>> from wcmatch import pathlib
>>> list(pathlib.Path('.').glob('docs/./src'))
[PosixPath('docs/src')]
Python's default glob:
>>> import pathlib
>>> pathlib.Path('.').match('.')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/Cellar/python@3.8/3.8.3/Frameworks/Python.framework/Versions/3.8/lib/python3.8/pathlib.py", line 976, in match
    raise ValueError("empty pattern")
ValueError: empty pattern
Ours:
>>> from wcmatch import pathlib
>>> pathlib.Path('.').match('.')
True
Similarities
- glob, rglob, and match should mimic the basic behavior of Python's original pathlib library, just with the enhancements and configurability that Wildcard Match's glob provides.
- rglob will exhibit the same recursive behavior.
- match will match using the same recursive behavior as rglob.
Classes
pathlib.PurePath
PurePath
is Wildcard Match's version of Python's PurePath
class. Depending on the system, it will create either a PureWindowsPath
or a PurePosixPath
object. Both objects will utilize wcmatch.glob
for all glob related actions.
PurePath
objects do not touch the filesystem. They include the methods match
and globmatch
(amongst others). You can force the path to access the filesystem if you give either function the REALPATH
flag. We do not restrict this, but we do not enable it by default. REALPATH
simply forces the match to check the filesystem to see if the file exists and is a directory or not.
>>> from wcmatch import pathlib
>>> pathlib.PurePath('docs/src')
PurePosixPath('docs/src')
PurePath
classes implement the match
and globmatch
methods:
>>> from wcmatch import pathlib
>>> p = pathlib.PurePath('docs/src')
>>> p.match('src')
True
>>> p.globmatch('**/src', flags=pathlib.GLOBSTAR)
True
pathlib.PureWindowsPath
PureWindowsPath
is Wildcard Match's version of Python's PureWindowsPath
. The PureWindowsPath
class is useful if you'd like to have the ease that pathlib
offers when working with a path, but don't want it to access the filesystem. This is also useful if you'd like to manipulate Windows path strings on a Posix system. This class will utilize Wildcard Match's glob
for all glob related actions. The class is subclassed from PurePath
.
>>> import os
>>> from wcmatch import pathlib
>>> os.name
'posix'
>>> pathlib.PureWindowsPath('c:/some/path')
PureWindowsPath('c:/some/path')
pathlib.PurePosixPath
PurePosixPath
is Wildcard Match's version of Python's PurePosixPath
. The PurePosixPath
class is useful if you'd like to have the ease that pathlib
offers when working with a path, but don't want it to access the filesystem. This is also useful if you'd like to manipulate Posix path strings on a Windows system. This class will utilize Wildcard Match's glob
for all glob related actions. The class is subclassed from PurePath
.
>>> import os
>>> from wcmatch import pathlib
>>> os.name
'nt'
>>> pathlib.PurePosixPath('/usr/local/bin')
PurePosixPath('/usr/local/bin')
pathlib.Path
Path
is Wildcard Match's version of Python's Path
class. Depending on the system, it will create either a WindowsPath
or a PosixPath
object. Both objects will utilize wcmatch.glob
for all glob related actions.
Path
classes are subclassed from the PurePath
objects, so you get all the features of the Path
class in addition to the PurePath
class features. Path
objects have access to the filesystem. They include the PurePath
methods match
and globmatch
(amongst others). Since these methods are PurePath
methods, they do not touch the filesystem. But, you can force them to access the filesystem if you give either function the REALPATH
flag. We do not restrict this, but we do not enable it by default. REALPATH
simply forces the match to check the filesystem to see if the file exists and is a directory or not.
>>> from wcmatch import pathlib
>>> pathlib.Path('docs/src')
PosixPath('docs/src')
Path
classes implement the glob
and rglob
methods:
>>> from wcmatch import pathlib
>>> p = pathlib.Path('docs/src')
>>> p.match('src')
True
>>> p.globmatch('**/src', flags=pathlib.GLOBSTAR)
True
>>> list(p.glob('**/*.txt', flags=pathlib.GLOBSTAR))
[PosixPath('docs/src/dictionary/en-custom.txt'), PosixPath('docs/src/markdown/_snippets/links.txt'), PosixPath('docs/src/markdown/_snippets/refs.txt'), PosixPath('docs/src/markdown/_snippets/abbr.txt'), PosixPath('docs/src/markdown/_snippets/posix.txt')]
>>> list(p.rglob('*.txt'))
[PosixPath('docs/src/dictionary/en-custom.txt'), PosixPath('docs/src/markdown/_snippets/links.txt'), PosixPath('docs/src/markdown/_snippets/refs.txt'), PosixPath('docs/src/markdown/_snippets/abbr.txt'), PosixPath('docs/src/markdown/_snippets/posix.txt')]
pathlib.WindowsPath
WindowsPath
is Wildcard Match's version of Python's WindowsPath
. The WindowsPath
class is useful if you'd like to have the ease that pathlib
offers when working with a path and be able to manipulate or gain access to information about that file. You cannot instantiate this class on a Posix system. This class will utilize Wildcard Match's glob
for all glob related actions. The class is subclassed from Path
.
>>> import os
>>> from wcmatch import pathlib
>>> os.name
'nt'
>>> pathlib.Path('c:/some/path')
WindowsPath('c:/some/path')
pathlib.PosixPath
PosixPath
is Wildcard Match's version of Python's PosixPath
. The PosixPath
class is useful if you'd like to have the ease that pathlib
offers when working with a path and be able to manipulate or gain access to information about that file. You cannot instantiate this class on a Windows system. This class will utilize Wildcard Match's glob
for all glob related actions. The class is subclassed from Path
.
>>> import os
>>> from wcmatch import pathlib
>>> os.name
'posix'
>>> pathlib.Path('/usr/local/bin')
PosixPath('/usr/local/bin')
Methods
PurePath.match
def match(self, patterns, *, flags=0, limit=1000, exclude=None):
match
takes a pattern (or list of patterns), and flags. It also allows configuring the max pattern limit. Exclusion patterns can be specified via the exclude
parameter which takes a pattern or a list of patterns. It will return a boolean indicating whether the object's file path was matched by the pattern(s).
match
mimics Python's pathlib
version of match
. Python's match
uses a right to left evaluation that behaves like rglob
but as a matcher instead of a globbing function. Wildcard Match emulates this behavior as well. What this means is that when provided with a path some/path/name
, the patterns name
, path/name
and some/path/name
will all match. Essentially, it matches what rglob
returns.
match
does not access the filesystem, but you can force the path to access the filesystem if you give it the REALPATH
flag. We do not restrict this, but we do not enable it by default. REALPATH
simply forces the match to check the filesystem to see if the file exists, if it is a directory or not, and whether it is a symlink.
Since Path
is derived from PurePath
, this method is also available in Path
objects.
>>> from wcmatch import pathlib
>>> p = pathlib.PurePath('docs/src')
>>> p.match('src')
True
New 6.0
limit
was added in 6.0.
New 8.4
exclude
parameter was added.
PurePath.globmatch
def globmatch(self, patterns, *, flags=0, limit=1000, exclude=None):
globmatch
takes a pattern (or list of patterns), and flags. It also allows configuring the max pattern limit. Exclusion patterns can be specified via the exclude
parameter which takes a pattern or a list of patterns. It will return a boolean indicating whether the object's file path was matched by the pattern(s).
globmatch
is similar to match
except it does not use the same recursive logic that match
does. In all other respects, it behaves the same.
globmatch
does not access the filesystem, but you can force the path to access the filesystem if you give it the REALPATH
flag. We do not restrict this, but we do not enable it by default. REALPATH
simply forces the match to check the filesystem to see if the file exists, if it is a directory or not, and whether it is a symlink.
Since Path
is derived from PurePath
, this method is also available in Path
objects.
>>> from wcmatch import pathlib
>>> p = pathlib.PurePath('docs/src')
>>> p.globmatch('**/src', flags=pathlib.GLOBSTAR)
True
New 6.0
limit
was added in 6.0.
New 8.4
exclude
parameter was added.
PurePath.full_match
New 10.0
def full_match(self, patterns, *, flags=0, limit=1000, exclude=None):
Python 3.13 added the new full_match
method to PurePath
objects. Essentially, this does for normal pathlib
what our existing PurePath.globmatch
has been doing prior to Python 3.13. We've added an alias for PurePath.full_match
that redirects to PurePath.globmatch
for completeness.
Path.glob
def glob(self, patterns, *, flags=0, limit=1000, exclude=None):
glob
takes a pattern (or list of patterns) and flags. It also allows configuring the max pattern limit. It will crawl the file system, relative to the current Path
object, returning a generator of Path
objects. If a file/folder matches any regular, inclusion pattern, it is considered a match. If a file matches any exclusion pattern (specified via exclude
or using negation patterns when enabling the NEGATE
flag), then it will not be returned.
This method calls our own iglob
implementation, and as such, should behave in the same manner in respect to features, the one exception being that instead of returning path strings in the generator, it will return Path
objects.
The one difference between this glob
and the iglob
API is that this function does not accept the root_dir
parameter. All searches are relative to the object's path, which is evaluated relative to the current working directory.
>>> from wcmatch import pathlib
>>> p = pathlib.Path('docs/src')
>>> list(p.glob('**/*.txt', flags=pathlib.GLOBSTAR))
[PosixPath('docs/src/dictionary/en-custom.txt'), PosixPath('docs/src/markdown/_snippets/links.txt'), PosixPath('docs/src/markdown/_snippets/refs.txt'), PosixPath('docs/src/markdown/_snippets/abbr.txt'), PosixPath('docs/src/markdown/_snippets/posix.txt')]
New 6.0
limit
was added in 6.0.
New 8.4
exclude
parameter was added.
Path.rglob
def rglob(self, patterns, *, flags=0, limit=1000, exclude=None):
rglob
takes a pattern (or list of patterns) and flags. It also allows configuring the max pattern limit. It will crawl the file system, relative to the current Path
object, returning a generator of Path
objects. If a file/folder matches any regular, inclusion pattern, it is considered a match. If a file matches any exclusion pattern (specified via exclude
or using negation patterns when enabling the NEGATE
flag), then it will not be returned.
rglob
mimics Python's pathlib
version of rglob
in that it uses a recursive logic. What this means is that when you are matching a path in the form some/path/name
, the patterns name
, path/name
and some/path/name
will all match. Essentially, the pattern behaves as if a GLOBSTAR
pattern of **/
was added at the beginning of the pattern.
rglob
is similar to glob
except for the use of recursive logic. In all other respects, it behaves the same.
>>> from wcmatch import pathlib
>>> p = pathlib.Path('docs/src')
>>> list(p.rglob('*.txt'))
[PosixPath('docs/src/dictionary/en-custom.txt'), PosixPath('docs/src/markdown/_snippets/links.txt'), PosixPath('docs/src/markdown/_snippets/refs.txt'), PosixPath('docs/src/markdown/_snippets/abbr.txt'), PosixPath('docs/src/markdown/_snippets/posix.txt')]
New 6.0
limit
was added in 6.0.
New 8.4
exclude
parameter was added.
Flags
pathlib.CASE, pathlib.C
CASE
forces case sensitivity. CASE
has higher priority than IGNORECASE
.
On Windows, drive letters (C:
) and UNC sharepoints (//host/share
) portions of a path will still be treated case insensitively, but the rest of the path will have case sensitive logic applied.
pathlib.IGNORECASE, pathlib.I
IGNORECASE
forces case insensitivity. CASE
has higher priority than IGNORECASE
.
pathlib.RAWCHARS, pathlib.R
RAWCHARS
causes string character syntax to be parsed in raw strings: r'\u0040'
→ r'@'
. This will handle standard string escapes and Unicode including r'\N{CHAR NAME}'
.
pathlib.NEGATE, pathlib.N
NEGATE
causes patterns that start with !
to be treated as exclusion patterns. A pattern of !*.py
would exclude any Python files. Exclusion patterns cannot be used by themselves though, and must be paired with a normal, inclusion pattern, either by utilizing the SPLIT
flag, or providing multiple patterns in a list. Assuming the SPLIT
flag, this means using it in a pattern such as inclusion|!exclusion
.
If it is desired, you can force exclusion patterns, when no inclusion pattern is provided, to assume all files match unless the file matches the excluded pattern. This is done with the NEGATEALL
flag.
NEGATE
enables DOTGLOB
in all exclude patterns, this cannot be disabled. This will not affect the inclusion patterns.
If NEGATE
is set and exclusion patterns are passed via a matching or glob function's exclude
parameter, NEGATE
will be ignored and the exclude
patterns will be used instead. Either exclude
or NEGATE
should be used, not both.
pathlib.NEGATEALL, pathlib.A
NEGATEALL
can force exclusion patterns, when no inclusion pattern is provided, to assume all files match unless the file matches the excluded pattern. Essentially, it means if you use a pattern such as !*.md
, it will assume two patterns were given: **
and !*.md
, where !*.md
is applied to the results of **
, and **
is specifically treated as if GLOBSTAR
was enabled.
Dot files will not be returned unless DOTGLOB
is enabled. Symlinks will also be ignored in the return unless FOLLOW
is enabled.
pathlib.MINUSNEGATE, pathlib.M
When MINUSNEGATE
is used with NEGATE
, exclusion patterns are recognized by a pattern starting with -
instead of !
. This plays nice with the extended glob feature which already uses !
in patterns such as !(...)
.
pathlib.GLOBSTAR, pathlib.G
GLOBSTAR
enables the feature where **
matches zero or more directories.
pathlib.GLOBSTARLONG, pathlib.GL
New 10.0
When GLOBSTARLONG
is enabled ***
will act like **
, but will cause symlinks to be traversed as well.
Enabling GLOBSTARLONG
automatically enables GLOBSTAR
.
FOLLOW
will be ignored and ***
will be required to traverse a symlink. But it should be noted that when using MATCHBASE
and FOLLOW
with GLOBSTARLONG
, that FOLLOW
will cause the implicit leading **
that MATCHBASE
applies to act as an implicit ***
.
pathlib.FOLLOW, pathlib.L
FOLLOW
will cause GLOBSTAR
patterns (**
) to match and traverse symlink directories.
FOLLOW
will have no effect if using GLOBSTARLONG
and an explicit ***
will be required to traverse a symlink. FOLLOW
will have an effect if enabled with GLOBSTARLONG
and MATCHBASE
and will cause the implicit leading **
that MATCHBASE
applies to act as an implicit ***
.
pathlib.REALPATH, pathlib.P
In the past, only glob
and iglob
operated on the filesystem, but with REALPATH
, other functions will now operate on the filesystem as well: globmatch
and match
.
Normally, functions such as globmatch
would simply match a path with a regular expression and return the result. The functions were not concerned with whether the path existed or not, or whether it was even valid for the operating system.
REALPATH
forces globmatch
and match
to treat the path as a real file path for the given system it is running on. It will augment the patterns used to match files and enable additional logic so that the path must meet the following in order to match:
- Path must exist.
- Directories that are symlinks will not be matched by GLOBSTAR patterns (**) unless the FOLLOW flag is enabled.
- If GLOBSTARLONG is enabled, *** will traverse symlinks. FOLLOW will be ignored except if MATCHBASE is also enabled; in that case, the implicit leading ** added by MATCHBASE will act as ***. This also affects the implicit leading ** added by rglob.
- When presented with a pattern where the match must be a directory, but the file path being compared doesn't indicate the file is a directory with a trailing slash, the command will look at the filesystem to determine if it is a directory.
- Paths must match in relation to the current working directory unless the pattern is constructed in a way that indicates an absolute path.
pathlib.DOTGLOB, pathlib.D
By default, globbing and matching functions will not match file or directory names that start with dot .
unless matched with a literal dot. DOTGLOB
allows the meta characters (such as *
) to glob dots like any other character. Dots will not be matched in []
, *
, or ?
.
Alternatively DOTMATCH
will also be accepted for consistency with the other provided libraries. Both flags are exactly the same and are provided as a convenience in case the user finds one more intuitive than the other since DOTGLOB
is often the name used in Bash.
pathlib.NODOTDIR, pathlib.Z
NODOTDIR
fundamentally changes how glob patterns deal with .
and ..
. This is great if you'd prefer a more Zsh feel when it comes to special directory matching. When NODOTDIR
is enabled, "magic" patterns, such as .*
, will not match the special directories of .
and ..
. In order to match these special directories, you will have to use literal glob patterns of .
and ..
. This can be used in all glob API functions that accept flags, and will affect inclusion patterns as well as exclusion patterns.
>>> from wcmatch import pathlib
>>> pathlib.Path('..').match('.*')
True
>>> pathlib.Path('..').match('.*', flags=pathlib.NODOTDIR)
False
>>> pathlib.Path('..').match('..', flags=pathlib.NODOTDIR)
True
Also affects exclusion patterns:
>>> from wcmatch import pathlib
>>> list(pathlib.Path('.').glob(['docs/..', '!*/.*'], flags=pathlib.NEGATE))
[]
>>> list(pathlib.Path('.').glob(['docs/..', '!*/.*'], flags=pathlib.NEGATE | pathlib.NODOTDIR))
[PosixPath('docs/..')]
>>> list(pathlib.Path('.').glob(['docs/..', '!*/..'], flags=pathlib.NEGATE | pathlib.NODOTDIR))
[]
New 7.0
NODOTDIR
was added in 7.0.
pathlib.SCANDOTDIR, pathlib.SD
Not recommended for pathlib
pathlib
supports all of the same flags that the wcmatch.glob
library does. But due to how pathlib
normalizes the paths that get returned, enabling SCANDOTDIR
will only give confusing duplicates if using patterns such as .*
. This is not a bug, but is something to be aware of.
SCANDOTDIR
controls the directory scanning behavior of glob
and rglob
. The directory scanner of these functions does not return .
and ..
in their results. This means unless you use an explicit .
or ..
in your glob pattern, .
and ..
will not be returned. When SCANDOTDIR
is enabled, .
and ..
will be returned when a directory is scanned causing "magic" patterns, such as .*
, to match .
and ..
.
This only controls the directory scanning behavior and not how glob patterns behave. Exclude patterns, which filter the returned results via NEGATE
, can still match .
and ..
with "magic" patterns such as .*
regardless of whether SCANDOTDIR
is enabled or not. It will also have no effect on globmatch
. To fundamentally change how glob patterns behave, you can use NODOTDIR
.
>>> from wcmatch import pathlib
>>> list(pathlib.Path('temp').glob('**/.*', flags=pathlib.GLOBSTAR | pathlib.DOTGLOB))
[PosixPath('temp/.hidden'), PosixPath('temp/.DS_Store')]
>>> list(pathlib.Path('temp').glob('**/.*', flags=pathlib.GLOBSTAR | pathlib.DOTGLOB | pathlib.SCANDOTDIR))
[PosixPath('temp'), PosixPath('temp/..'), PosixPath('temp/.hidden'), PosixPath('temp/.hidden/..'), PosixPath('temp/.DS_Store')]
Notice when we turn off unique result filtering how we get multiple temp/.hidden
results. This is due to how pathlib
normalizes directories. When comparing the results to a non-pathlib
glob, the results make a bit more sense.
>>> list(pathlib.Path('temp').glob('**/.*', flags=pathlib.GLOBSTAR | pathlib.DOTGLOB | pathlib.SCANDOTDIR | pathlib.NOUNIQUE))
[PosixPath('temp'), PosixPath('temp/..'), PosixPath('temp/.hidden'), PosixPath('temp/.hidden'), PosixPath('temp/.hidden/..'), PosixPath('temp/.DS_Store')]
>>> from wcmatch import glob
>>> list(glob.glob('**/.*', flags=glob.GLOBSTAR | glob.DOTGLOB | glob.SCANDOTDIR, root_dir="temp"))
['.', '..', '.hidden', '.hidden/.', '.hidden/..', '.DS_Store']
New 7.0
SCANDOTDIR
was added in 7.0.
pathlib.EXTGLOB, pathlib.E
EXTGLOB
enables extended pattern matching which includes special pattern lists such as +(...)
, *(...)
, ?(...)
, etc. Pattern lists allow for multiple patterns within them separated by |
. See the globbing syntax overview for more information.
Alternatively EXTMATCH
will also be accepted for consistency with the other provided libraries. Both flags are exactly the same and are provided as a convenience in case the user finds one more intuitive than the other since EXTGLOB
is often the name used in Bash.
EXTGLOB and NEGATE
When using EXTGLOB
and NEGATE
together, if a pattern starts with !(
, the pattern will not be treated as a NEGATE
pattern (even if !(
doesn't yield a valid EXTGLOB
pattern). To negate a pattern that starts with a literal (
, you must escape the bracket: !\(
.
pathlib.BRACE, pathlib.B
BRACE
enables Bash style brace expansion: a{b,{c,d}}
→ ab ac ad
. Brace expansion is applied before anything else. When applied, a pattern will be expanded into multiple patterns. Each pattern will then be parsed separately.
Duplicate patterns will be discarded by default, and glob
and rglob
will return only unique results. If you need glob
or rglob
to behave more like Bash and return all results, you can set NOUNIQUE
. NOUNIQUE
has no effect on matching functions such as globmatch
and match
.
For simple patterns, it may make more sense to use EXTGLOB
which will only generate a single pattern which will perform much better: @(ab|ac|ad)
.
Massive Expansion Risk
- It is important to note that each pattern is crawled separately, so patterns such as {1..100} would generate one hundred patterns. In a match function (globmatch), that would cause a hundred compares, and in a file crawling function (glob), it would cause the file system to be crawled one hundred times. Sometimes patterns like this are needed, so construct patterns thoughtfully and carefully.
- BRACE and SPLIT both expand patterns into multiple patterns. Using these two syntaxes simultaneously can exponentially increase duplicate patterns:
>>> expand('test@(this{|that,|other})|*.py', BRACE | SPLIT | EXTMATCH)
['test@(this|that)', 'test@(this|other)', '*.py', '*.py']
This effect is reduced as redundant, identical patterns are optimized away, but when using crawling functions (like glob) and NOUNIQUE, that optimization is removed, and all of those patterns will be crawled. For this reason, especially when using functions like glob, it is recommended to use one syntax or the other.
pathlib.SPLIT, pathlib.S
SPLIT
is used to take a string of multiple patterns that are delimited by |
and split them into separate patterns. This is provided to help with some interfaces that might need a way to define multiple patterns in one input. It pairs really well with EXTGLOB
and takes into account sequences ([]
) and extended patterns (*(...)
) and will not parse |
within them. You can also escape the delimiters if needed: \|
.
Duplicate patterns will be discarded by default, and glob
and rglob
will return only unique results. If you need glob
or rglob
to behave more like Bash and return all results, you can set NOUNIQUE
. NOUNIQUE
has no effect on matching functions such as globmatch
and match
.
While SPLIT
is not as powerful as BRACE
, its syntax is very easy to use, and when paired with EXTGLOB
, it feels natural and comes a bit closer. It is also much harder to create massive expansions of patterns with it, except when paired with BRACE
. See BRACE
and its warnings related to pairing it with SPLIT
.
>>> from wcmatch import pathlib
>>> list(pathlib.Path('.').glob('README.md|LICENSE.md', flags=pathlib.SPLIT))
[WindowsPath('README.md'), WindowsPath('LICENSE.md')]
pathlib.NOUNIQUE, pathlib.Q
NOUNIQUE
is used to disable Wildcard Match's unique results return. This mimics Bash's output behavior if that is desired.
>>> from wcmatch import glob
>>> glob.glob('{*,README}.md', flags=glob.BRACE | glob.NOUNIQUE)
['LICENSE.md', 'README.md', 'README.md']
>>> glob.glob('{*,README}.md', flags=glob.BRACE )
['LICENSE.md', 'README.md']
By default, only unique paths are returned in glob
and rglob
. Normally this is what a programmer would want from such a library, so input patterns are reduced to unique patterns to reduce excessive matching with redundant patterns and excessive crawls through the file system. Also, as two different patterns that have been fed into glob
may match the same file, the results are also filtered as to not return the duplicates.
Unique results are accomplished by filtering out duplicate patterns and by retaining an internal set of returned files to determine duplicates. The internal set of files is not retained if only a single, inclusive pattern is provided. Exclusive patterns via NEGATE
will not trigger the logic, but singular inclusive patterns that use pattern expansions due to BRACE
or SPLIT
will act as if multiple patterns were provided, and will trigger the duplicate filtering logic. Lastly, if SCANDOTDIR
is enabled, even singular inclusive patterns will trigger duplicate filtering logic to protect against cases where pathlib
will normalize two unique results to be the same path, such as .hidden
and .hidden/.
which get normalized to .hidden
.
NOUNIQUE
disables all of the aforementioned "unique" optimizations, but only for glob
and rglob
. Functions like globmatch
and match
would get no benefit from disabling "unique" optimizations as they only match what they are given.
New 6.0
"Unique" optimizations were added in 6.0, along with NOUNIQUE
.
pathlib.MATCHBASE, pathlib.X
MATCHBASE
, when a pattern has no slashes in it, will cause all glob related functions to search for any file anywhere in the tree with a matching basename, or in the case of match
and globmatch
, to match any path whose basename matches. MATCHBASE
is sensitive to files and directories that start with .
and will not match such files and directories if DOTGLOB
is not enabled.
>>> from wcmatch import pathlib
>>> list(pathlib.Path('.').glob('*.txt', flags=pathlib.MATCHBASE))
[WindowsPath('docs/src/dictionary/en-custom.txt'), WindowsPath('docs/src/markdown/_snippets/abbr.txt'), WindowsPath('docs/src/markdown/_snippets/links.txt'), WindowsPath('docs/src/markdown/_snippets/posix.txt'), WindowsPath('docs/src/markdown/_snippets/refs.txt'), WindowsPath('requirements/docs.txt'), WindowsPath('requirements/lint.txt'), WindowsPath('requirements/setup.txt'), WindowsPath('requirements/test.txt'), WindowsPath('requirements/tools.txt'), WindowsPath('site/_snippets/abbr.txt'), WindowsPath('site/_snippets/links.txt'), WindowsPath('site/_snippets/posix.txt'), WindowsPath('site/_snippets/refs.txt')]
pathlib.NODIR, pathlib.O
NODIR
will cause all glob related functions to return only matched files. In the case of PurePath
classes, this may not be possible as those classes do not access the file system, nor will they retain trailing slashes.
>>> from wcmatch import pathlib
>>> list(pathlib.Path('.').glob('*', flags=pathlib.NODIR))
[WindowsPath('appveyor.yml'), WindowsPath('LICENSE.md'), WindowsPath('MANIFEST.in'), WindowsPath('mkdocs.yml'), WindowsPath('README.md'), WindowsPath('setup.cfg'), WindowsPath('setup.py'), WindowsPath('tox.ini')]
>>> list(pathlib.Path('.').glob('*'))
[WindowsPath('appveyor.yml'), WindowsPath('docs'), WindowsPath('LICENSE.md'), WindowsPath('MANIFEST.in'), WindowsPath('mkdocs.yml'), WindowsPath('README.md'), WindowsPath('requirements'), WindowsPath('setup.cfg'), WindowsPath('setup.py'), WindowsPath('site'), WindowsPath('tests'), WindowsPath('tox.ini'), WindowsPath('wcmatch')]