Regexes and escape characters

Apr 13, 2010
236
3
58
The Hague
#1
I want to SET a number of envars and validate each one against a regex that describes strings that will be passed correctly as arguments to functions and the like.

When passing envars around, to and from subroutines, unquoting, changing and requoting them, things can go wrong. Sometimes there is quite a "distance" between the point where a mistake produces a malformed string and the moment the error actually shows itself. By using a SET /T:regex:"..." I hope to catch those mistakes earlier.

I already tested the regex outside of TC; it reads as follows in Perl notation.
Code:
^([^\s,"]+|"[\w\s,]+")$
In short it says: the whole string is acceptable if it is either an unquoted string that contains no whitespace, commas or double quotes, or a quoted string made up of word characters, whitespace and commas. I know improvements are possible but I'll cross that bridge after solving this riddle.
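For readers who want to check the Perl-notation pattern outside TC, here is a minimal sketch in Python. It assumes Python's `re` module treats this pattern the same way Perl does, and it says nothing about TCC's own quoting and escaping. The helper name `is_acceptable` is made up for illustration; the sample strings are the test cases listed later in this thread.

```python
import re

# The pattern exactly as posted above, in Perl notation.
PATTERN = re.compile(r'^([^\s,"]+|"[\w\s,]+")$')


def is_acceptable(value: str) -> bool:
    """Return True if the whole string matches one of the two alternatives."""
    return PATTERN.fullmatch(value) is not None


# (string, expected result) pairs taken from the test cases in this thread.
CASES = [
    ('Valid', True),                      # unquoted, no whitespace/comma/quote
    ('So wrong', False),                  # unquoted with a space
    ('also,flaw', False),                 # unquoted with a comma
    ('"Some sense"', True),               # quoted, space allowed
    ('"and,sensibility"', True),          # quoted, comma allowed
    (' "Good, BAD and ugly"', False),     # leading space before the quote
    ('""Good, BAD and ugly""', False),    # doubled quotes
    ('"Good","BAD and ugly"', False),     # two quoted strings, not one
]

for text, expected in CASES:
    result = is_acceptable(text)
    flag = 'ok' if result == expected else 'MISMATCH'
    print(f'{int(result)} ... {text}   [{flag}]')
```

If the pattern behaves here as intended, any remaining "Parameter error" is more likely to come from TCC's quoting and escaping than from the regex itself.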

At this point I'm stuck because I can't find the right combination of back-ticking, quoting, escaping and SETDOS expansion to get TC to swallow it.
I keep getting "Parameter error".

I have attached a test-driver and test-cases. Enjoy.

Any insights?

I pray this is not something trivial. :oops:
 

#2
With the Perl syntax, this works or at least comes close.
Code:
v:\> echo %@regex["^([^\s\^"\,]+|\".*\")$","oo,oo"]
1

v:\> echo %@regex["^([^\s\^"\,]+|\".*\")$","oo oo"]
1

v:\> echo %@regex["^([^\s\^"\,]+|\".*\")$",oo oo]
0

v:\> echo %@regex["^([^\s\^"\,]+|\".*\")$",oo,oo]
0

v:\> echo %@regex["^([^\s\^"\,]+|\".*\")$",oo"oo]
0

v:\> echo %@regex["^([^\s\^"\,]+|\".*\")$",oooo]
1
A literal double quote (always) needs RE escaping (\), and, in this case, TCC escaping (so it doesn't end @regex's first parameter). I don't know why the comma in the character class needs RE escaping (but it does).
 
#3
I copied your version and adapted my test script, trying to understand.
I still have some questions:
- in the left alternative you tc-esc the quotes, in the right you only rx-esc them. Why is that?
- why do we not need to double the caret at the beginning?

I ended up with
Code:
"^([^\s\^"\,]+|\"[\w\s\,]+\")$"
But the three final cases are matching and they shouldn't.
 

#4
They don't match here.
Code:
v:\> echo %@regex["^([^\s^"\,]+|\"[\w\s,]+\")$", "Good, BAD and ugly"]
0

v:\> echo %@regex["^([^\s^"\,]+|\"[\w\s,]+\")$","Good","BAD and ugly"]
0

v:\> echo %@regex["^([^\s^"\,]+|\"[\w\s,]+\")$",""Good, BAD and ugly""]
0
And I think I was wrong earlier. The double quote doesn't need to be RE escaped inside [] (but escaping it doesn't hurt, and it does need to be TCC escaped).
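The regex-level half of that observation can be checked outside TC. The sketch below uses Python's `re` module as a stand-in (an assumption: TCC's regex engine may differ in other respects) and compares a character class with a plain `"` and `,` against one where both are backslash-escaped. It does not model TCC's own escaping or parameter parsing, which is a separate layer. The helper name `same_verdict` is made up for illustration.

```python
import re

# Same negated class twice: once with plain characters, once backslash-escaped.
PLAIN = re.compile(r'[^\s",]+')
ESCAPED = re.compile(r'[^\s\"\,]+')


def same_verdict(text: str) -> bool:
    """True if both spellings of the class accept or reject the string alike."""
    plain_ok = PLAIN.fullmatch(text) is not None
    escaped_ok = ESCAPED.fullmatch(text) is not None
    return plain_ok == escaped_ok


# Unquoted samples from the transcripts earlier in the thread.
SAMPLES = ['oooo', 'oo"oo', 'oo,oo', 'oo oo']

for sample in SAMPLES:
    accepted = PLAIN.fullmatch(sample) is not None
    print(f'{sample!r}: accepted={accepted}, spellings agree={same_verdict(sample)}')
```

In this engine the backslash in front of `"` or `,` inside `[]` is redundant, which fits the "doesn't hurt" remark above.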
 
#5
I think all the test cases look OK.
Code:
v:\> do l in @testcases.txt ( echo %@regex["^([^\s^"\,]+|\"[\w\s,]+\")$",%l] ... %l)
1 ... Valid
0 ... So wrong
0 ... also,flaw
0 ... neater, flaw
1 ... "Ahhh"
1 ... "Some sense"
1 ... "and,sensibility"
1 ... "neater, even"
1 ... "GOOD, bad and ugly"
0 ...  "Good, BAD and ugly"
0 ... ""Good, BAD and ugly""
0 ... "Good","BAD and ugly"
 