Well, starting to play with regular expressions has certainly led me to discover a lot of bugs. Here's another one.
The help entry called "Wildcards and Regular Expressions" shows how to use regular expressions for file name matching, such as with the DIR command. The discussion includes the following examples.
dir ::ca[td]
dir "::^\w{1,8}\.btm$"
The second example illustrates what to do if the regular expression contains any special characters, such as whitespace, redirection characters, or escape characters.
Unfortunately, it turns out that regular expressions actually do not work properly with the DIR command (and PDIR and maybe elsewhere). This post will illustrate the problems.
The examples use a DIR alias to keep the display simple.
TCC(30.00.18): C:\temp\sandbox>alias dirx
*dir /h /k /m /b
And I created a sandbox area with a few files.
TCC(30.00.18): C:\temp\sandbox>dirx
1
2
A
a b
B
Conversion to Lower Case
It appears that with the DIR command, the entire command tail is converted to lower case. As a result, the regular expressions that use uppercase letters do not work. For example, \d is supposed to select digit characters and \D is supposed to select the opposite, non-digit characters. However, both act like the lowercase version.
TCC(30.00.18): C:\temp\sandbox>dirx ::\d
1
2
TCC(30.00.18): C:\temp\sandbox>dirx ::\D
1
2
This is not a problem with the regular expression engine itself, as illustrated by a similar command that uses @REGEX.
TCC(30.00.18): C:\temp\sandbox>for %file in (*) do if %@regex[\D,%file] EQ 1 echo Matching file: %file
Matching file: A
Matching file: a b
Matching file: B
If I convert that command line to lower case by hand, we see the same erroneous result as with the DIR command.
TCC(30.00.18): C:\temp\sandbox>for %file in (*) do if %@regex[\d,%file] eq 1 echo Matching file: %file
Matching file: 1
Matching file: 2
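For anyone who wants to see the effect outside TCC, here is a minimal Python sketch of the suspected cause. It assumes that lower-casing the pattern text is the only thing going wrong and that a pattern matches when it is found anywhere in the name. Python's re module is not the engine TCC uses, and the helper name matching is just for illustration. The file names are the ones from my sandbox.

```python
import re

# Sandbox file names taken from the directory listing earlier in the post.
files = ["1", "2", "A", "a b", "B"]

def matching(pattern, names):
    """Return the names in which the pattern is found anywhere (search semantics)."""
    rx = re.compile(pattern)
    return [n for n in names if rx.search(n)]

intended = r"\D"            # non-digit class, as typed on the command line
lowered = intended.lower()  # what a lower-cased command tail would contain

print(matching(intended, files))  # ['A', 'a b', 'B']
print(matching(lowered, files))   # ['1', '2']
```

The second result is the same list that DIR returned for ::\D above, which is consistent with the pattern having been lower-cased before it reached the regular expression engine.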
Failure to Recognize POSIX Bracket Syntax
Regular expressions support the following character classes, among others:
[:digit:]
[:upper:]
[:lower:]
And they work with @REGEX.
TCC(30.00.18): C:\temp\sandbox>for %file in (*) do if %@regex["[[:lower:]]",%file] EQ 1 echo Matching file: %file
Matching file: a b
TCC(30.00.18): C:\temp\sandbox>for %file in (*) do if %@regex["[[:upper:]]",%file] EQ 1 echo Matching file: %file
Matching file: A
Matching file: B
TCC(30.00.18): C:\temp\sandbox>for %file in (*) do if %@regex["[[:digit:]]",%file] EQ 1 echo Matching file: %file
Matching file: 1
Matching file: 2
But they do not work with DIR.
TCC(30.00.18): C:\temp\sandbox>dir "::[[:lower:]]"
Volume in drive C is Windows Serial number is a879:820d
TCC: (Sys) The system cannot find the file specified.
"C:\temp\sandbox\::[[:lower:]]"
0 bytes in 0 files and 0 dirs