Regular Expression Issue

Rex,

Thanks for fixing the regular-expression syntax for the DIR command. Almost everything is working, including the uppercase expressions \D, \W, etc.

However, I don't understand the following with the POSIX character classes, which behave differently when surrounded by double brackets or single brackets. (I used an alias to save typing.)

TCC(30.00.21): C:\temp\sandbox>zz alpha
--------------------
dirx ::[[:alpha:]]:
--------------------
A
a b
b
--------------------
dirx ::[:alpha:]
--------------------
A
a b

TCC(30.00.21): C:\temp\sandbox>zz lower
--------------------
dirx ::[[:lower:]]:
--------------------
a b
b
--------------------
dirx ::[:lower:]
--------------------
TCC: (Sys) The system cannot find the file specified.
"C:\temp\sandbox\::[:lower:]"

Perhaps the POSIX expressions are supposed to always be used in a character set and not individually, but the various versions seem to behave differently.

TCC(30.00.21): C:\temp\sandbox>zz alnum
--------------------
dirx ::[[:alnum:]]:
--------------------
1
2
A
a b
b
--------------------
dirx ::[:alnum:]
--------------------
A
a b
 
Some of those look screwy. Did you test the POSIX regexes without DIR (e.g., with @REGEX) to see if they otherwise work as expected?
 
No, it was too late at night (and I didn't want to stay up all night the way I did last time :-). I was expecting that you regex experts would explain the behavior, most likely telling me that this was working as expected.

I just wrote a script that uses @REGEX instead of the DIR command to process the files. I again see a difference between the single-bracket and double-bracket cases, but the results are not always the same as reported above.

TCC(30.00.21): C:\temp\sandbox>set regex=alpha & batrun zz
---------------------------
Using regex "[[alpha]]"
---------------------------
A
a b
b
---------------------------
Using regex "[alpha]"
---------------------------
a b

TCC(30.00.21): C:\temp\sandbox>set regex=lower & batrun zz
---------------------------
Using regex "[[lower]]"
---------------------------
a b
b
---------------------------
Using regex "[lower]"
---------------------------

TCC(30.00.21): C:\temp\sandbox>set regex=alnum & batrun zz
---------------------------
Using regex "[[alnum]]"
---------------------------
1
2
A
a b
b
---------------------------
Using regex "[alnum]"
---------------------------
a b

Using double brackets always seems to work correctly. The usage outside a character set (i.e., with single brackets) is probably incorrect.
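
One possible explanation for the single-bracket results, offered as a guess rather than a statement about how TCC's regex engine is implemented: in most regex flavors a POSIX class name is only recognized inside a bracket expression, so `[:alpha:]` on its own is read as an ordinary set of the literal characters `:`, `a`, `l`, `p` and `h`. The Python sketch below models that reading. The file list is inferred from the listings above, and treating DIR as case-insensitive and @REGEX as case-sensitive is an assumption that happens to fit the output.

```python
import re

# File names inferred from the listings in this thread (an assumption).
files = ["1", "2", "A", "a b", "b"]

def matches(pattern, flags=0):
    """Return the file names that contain a match for the pattern."""
    rx = re.compile(pattern, flags)
    return [name for name in files if rx.search(name)]

# Single brackets: Python, like many engines, reads these as plain
# sets of literal characters, not as POSIX classes.
print(matches("[:alpha:]"))                 # ['a b']       (like @REGEX)
print(matches("[:alpha:]", re.IGNORECASE))  # ['A', 'a b']  (like DIR)
print(matches("[:lower:]", re.IGNORECASE))  # []
print(matches("[:alnum:]", re.IGNORECASE))  # ['A', 'a b']  (like DIR)
```

Under that reading, `[:lower:]` finds nothing because none of the five names contains l, o, w, e, r or a colon, which would account for the "cannot find the file" error.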

Here is a case using a set of either a digit or a space.

TCC(30.00.21): C:\temp\sandbox>dirx ::"[[:digit:]\s]"
1
2
a b
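
As a rough cross-check outside TCC, the same "digit or whitespace" set can be tried in Python. Python's `re` module has no POSIX classes, so `[0-9]` stands in for `[[:digit:]]` here; the file list is again inferred from the listings above.

```python
import re

# File names inferred from the listings in this thread (an assumption).
sandbox = ["1", "2", "A", "a b", "b"]

# [0-9\s] is used as a stand-in for [[:digit:]\s]: one digit or one whitespace.
digit_or_space = re.compile(r"[0-9\s]")

hits = [name for name in sandbox if digit_or_space.search(name)]
print(hits)  # ['1', '2', 'a b']
```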
 
This site (which may not be infallible) seems to agree.

[Screenshots: 1686758794008.png, 1686758839744.png]
 
I tried looking at some documentation on regular expressions and did not find anything that covered POSIX beyond a brief mention. From what I did read, the whole subject is so complex that learning it could be a master's degree program!

The regex analyzer built into Take Command does show them with double brackets. It does not flag the expression as invalid with only single brackets, but the test fails.

Anyway, I think that the DIR command is now running very nicely. Thank you, Rex, for making it work.
 
