@XREPLACE (4UTILS) issues?

kwa

Jun 24, 2011
15
0
Here's an example of why imho something has to be done regarding quotes.

The testcase I've been using is this:

Code:
echo %@XREPLACE["^^(.*)two",REPLACE,"one two three one two three"]
That's (imho) an entirely natural way to use XREPLACE. I quoted the third string because it contains spaces. I might also have done that if it contained commas, or if I didn't know what syntax-breaking characters it might contain.

I thought I'd redirect the output to a file:
Code:
echo %@XREPLACE["^^(.*)two",REPLACE,"one two three one two three"] >log.txt
and got this output:

Code:
^(.*)two REPLACE "one two three one two three"
REPLACE three" >log.txt

Ignore the first line - I asked about that in a previous post. The second line showed the redirection clause being ignored and echoed in the output. At first I didn't understand why - I thought it might have been something to do with the reason why the first line appeared.

But then I realised it's because the first quote in "one two three one two three" is treated as input, matched by the (.*), and replaced by REPLACE, so the echo statement resolves to:

echo REPLACE three" >log.txt

This seems to make tcc treat the redirection clause as normal text to be echoed, even though there's no closing " and so it isn't really meaningful syntax.
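
To see the greedy match swallowing the opening quote in isolation, here is a minimal sketch in Python. It assumes Python's re module handles this simple pattern the same way as the regex engine behind @XREPLACE, which may not hold for every pattern; the variable names are made up for illustration.

```python
import re

# Assumption: Python's re stands in for the plugin's regex engine here.
pattern = r"^(.*)two"

# The target exactly as written in the test case, quotes included as data.
quoted = '"one two three one two three"'

# The same target with the enclosing quotes treated as delimiters and dropped.
unquoted = quoted.strip('"')

# Greedy (.*) runs from the start of the string (including the opening quote)
# up to the LAST "two", so only the tail and the closing quote survive.
print(re.sub(pattern, "REPLACE", quoted))    # REPLACE three"

# With the quotes stripped first, no stray quote is left behind.
print(re.sub(pattern, "REPLACE", unquoted))  # REPLACE three
```

The stray closing quote in the first result is what then makes tcc treat >log.txt as quoted text.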

I'd argue that isn't intuitive behaviour. While it certainly isn't consistently applied throughout tcc's variable functions, there are other places in tcc where double quotes are taken as delimiters not part of the actual text, for example

Code:
echo %@EVAL["1+2"]

returns 3 rather than an error. A " isn't meaningful in a numeric expression, so the quotes are evidently being interpreted as delimiters and not as part of the actual data.

I believe XREPLACE should do the same or, if backward compatibility is important, there should be an XREPLACEQ which treats enclosing quotes as delimiters, and makes them mandatory.

A further observation is that quotes on the regex parameter are seemingly treated as delimiters, and discarded, if they are present.

So the following are functionally identical:

Code:
echo %@XREPLACE["^^(.*)two",REPLACE,"one two three one two three"] 
echo %@XREPLACE[^^(.*)two,REPLACE,"one two three one two three"]

So I'd also argue that XREPLACE isn't even consistent with itself as regards how it handles quoted parameters - if they're on the regex they're ignored, if they're on the target string they're literal.

I hope I don't sound like I'm whinging or unappreciative - I'm not. I think 4utils is great and it's great that tcc has a community doing this type of stuff for the benefit of others.

Nor do I claim to be right - as a newbie, it's quite likely I'm wrong in my understanding and/or expectations.
 
May 20, 2008
11,800
118
Syracuse, NY, USA
On Mon, 11 Jul 2011 03:04:11 -0400, kwa <> wrote:

|echo '%@XREPLACE["^^(.*?)two",REPLACE,"one two three one two three"]'

Oops! I didn't take out all the diagnostic stuff. It's fixed; new one
uploaded.
 
May 20, 2008
11,800
118
Syracuse, NY, USA
On Mon, 11 Jul 2011 08:56:20 -0400, kwa <> wrote:

|echo %@XREPLACE["^^(.*)two",REPLACE,"one two three one two three"]
|REPLACE three"

|That's (imho) an entirely natural way to use XREPLACE. I quoted the third string because it contains spaces. I might also have done that if it contained commas, or if I didn't know what syntax-breaking characters it might contain.

It did what you asked. What do you suggest?

The same thing will happen with SED (and I imagine with most implementations of
regex replacement):

Code:
v:\> echo "a b c" | sed s/^\(.*\)b/foo/
foo c"
|
|I thought I'd redirect the output to a file:
|
|Code:
|---------
|echo %@XREPLACE["^^(.*)two",REPLACE,"one two three one two three"] >log.txt
|---------
|and got this output:

That's a TCC anomaly. It's seeing the ">" as being quoted.

Code:
v:\> echo foo" > log.txt
foo" > log.txt

It's not necessary to quote the target string because of spaces. But
unfortunately, commas will screw things up (because of my adding the [,N]
parameter). I'll think about how to fix that. I don't want to require quoting
the string but that's tricky. In something like

[regex,replace,N,string, with, commas]

where does the string begin? I could require the ",N" parameter (0=all) or go
with the idea of two functions (which is rather easy and inexpensive).
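
The ambiguity can be sketched in Python. Both parsers below are hypothetical and are not how the plugin actually splits its arguments; they also ignore commas inside the regex or the replacement. The point is only that an optional N cannot be told apart from a string that happens to start with a number, while a mandatory N lets everything after the third comma be taken as the string.

```python
def split_optional_n(args: str):
    """Hypothetical parser for [regex,replace[,N],string], with N optional."""
    parts = args.split(",")
    regex, replacement, rest = parts[0], parts[1], parts[2:]
    # Guess: a leading all-digit field followed by more text is taken as N.
    if len(rest) > 1 and rest[0].strip().isdigit():
        return regex, replacement, int(rest[0]), ",".join(rest[1:])
    return regex, replacement, 0, ",".join(rest)


def split_required_n(args: str):
    """Hypothetical parser for [N,regex,replace,string], with N mandatory (0 = all)."""
    # Splitting at most 3 times leaves any commas in the string untouched.
    n, regex, replacement, string = args.split(",", 3)
    return regex, replacement, int(n), string


# Intended string is "2,3 apples", but the optional-N guess steals the "2".
print(split_optional_n("a,b,2,3 apples"))    # ('a', 'b', 2, '3 apples')

# With N required up front, the string survives intact.
print(split_required_n("0,a,b,2,3 apples"))  # ('a', 'b', 0, '2,3 apples')
```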
 
May 20, 2008
11,800
118
Syracuse, NY, USA
There are two ways I'll go here.

1. Require 4 parameters (N (all/some), regex, replacement, string). This would require the string to be quoted only if it started with whitespace.

2. Accept but not require the N parameter. This would require the string to be quoted if it contained any whitespace or commas.

And what to do about quoted strings? Since quotes may be required I'll remove them before processing. Should they then be replaced afterward?
 

kwa

Jun 24, 2011
15
0
Hi again, sorry for not replying earlier - been sick.

I'll think and get back. But I want to ask this one more time, because it solves multiple problems.

Is there no way you'll consider a pass-by-reference convention for the 3 string parameters?

Simply put:

a) XREPLACE(..., FRED,...)

The (string) parameter is interpreted as the literal string FRED. Just as now.

b) XREPLACE(..., @FRED,...)

The (string) parameter is interpreted as the contents of the environment variable named FRED (if it exists, else take it literally as above).

Then, with b), it doesn't matter what funny characters (or use of quotes) you might want to use - and when working with regexes you're likely to be using all sorts of them. They never appear on the XREPLACE call line, so they can't cause a problem, XREPLACE doesn't have to worry about what to do with quotes, and nor does a user of it. The text just gets read from the environment, and goes straight to the regex engine.
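
As a rough illustration of the convention I mean, here is a sketch in Python. The function name resolve_param and the env argument are hypothetical, made up for this example; it is not a claim about how XREPLACE is or would be implemented.

```python
import os


def resolve_param(param: str, env=os.environ) -> str:
    """Hypothetical resolver for the proposed '@NAME' convention."""
    # '@NAME' is replaced by the variable's contents, but only if NAME exists.
    if param.startswith("@") and param[1:] in env:
        return env[param[1:]]
    # Plain text, or '@NAME' with no such variable, is taken literally.
    return param


# A stand-in environment so the example does not depend on the real one.
env = {"FRED": 'say "hi", ^(.*)$'}

print(resolve_param("FRED", env))   # FRED
print(resolve_param("@FRED", env))  # say "hi", ^(.*)$
print(resolve_param("@NOPE", env))  # @NOPE
```

The quotes, commas and regex characters in FRED never appear on the call line, so nothing there has to parse them.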
 
May 20, 2008
11,800
118
Syracuse, NY, USA
On Sat, 16 Jul 2011 05:07:22 -0400, kwa <> wrote:

|b) XREPLACE(..., @FRED,...)
|
|The (string) parameter Is interpreted as the contents of the environmental variable named FRED (if it exists, else take it literally as above).
|
|Then, with b), it doesn't matter what funny characters (or use of quotes) you might want to use. They never appear on the XREPLACE call line, so they can't cause a problem. They just go straight to the regex engine.

In which of the three parameters? It would seem equally useful in both
"replacement" and "string".

I devised a way to handle specifying some vs. all replacements. If regex begins
with N$ (for example 1$) and is otherwise not empty then up to N (1)
replacements will be made. I think it very unlikely that regexes would
otherwise begin with a number followed by end-of-line (followed by more).
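
A sketch of that prefix rule in Python, under the assumption that "otherwise not empty" means at least one character must follow the $. The names split_limit and xreplace are hypothetical and Python's re stands in for the real regex engine.

```python
import re


def split_limit(pattern: str):
    """Hypothetical parser for the 'N$' prefix; returns (limit, regex), 0 = all."""
    # Digits, a literal '$', then a non-empty remainder which is the real regex.
    m = re.match(r"(\d+)\$(.+)", pattern, re.DOTALL)
    if m:
        return int(m.group(1)), m.group(2)
    # No prefix (or nothing after it): use the pattern as-is, replace all.
    return 0, pattern


def xreplace(pattern: str, replacement: str, string: str) -> str:
    """Replace up to N matches if the pattern starts with 'N$', else all of them."""
    limit, regex = split_limit(pattern)
    return re.sub(regex, replacement, string, count=limit)


text = "one two three one two three"
print(xreplace("two", "X", text))    # one X three one X three
print(xreplace("1$two", "X", text))  # one X three one two three
```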

What about quotes on "string"? They will only be necessary if "string" begins
with whitespace. Should they be removed before processing? ... replaced
afterwards?
 
May 20, 2008
11,800
118
Syracuse, NY, USA
There's an updated 4UTILS at lucky.syr.edu/4plugins/4utils.zip. The usage of @XREPLACE is now:

@XREPLACE[pattern,replacement,string]

replace all/some (*) occurrences of pattern in string

pattern may be regex
replacement may reference captures \0 - \31 or \{0} - \{31}
\n = newline (CRLF), \t = tab; \\ = backslash
quote pattern/replacement containing whitespace or commas
quote string beginning with whitespace
TCC escape sequences are processed in all three parameters
If a parameter is quoted the quotes are removed
otherwise if it begins with '@' it is treated as a variable name
(*) To limit replacements to N prefix with N$

If an unquoted parameter begins with '@' and the rest of it does not name an existing environment variable, that parameter is used literally.
 

kwa

Jun 24, 2011
15
0
Great stuff mate - thanks so much for that.

I'll download it and give it a workout this afternoon.
 
May 20, 2008
3,515
4
Elkridge, MD, USA
vefatica wrote:
| On Mon, 11 Jul 2011 03:04:11 -0400, kwa <> wrote:
|
|| echo '%@XREPLACE["^^(.*?)two",REPLACE,"one two three one two three"]'
|
| Oops! I didn't take out all the diagnostic stuff. It's fixed; new one
| uploaded.

That's easily avoidable in new software. Just utilize the standard
preprocessor macro NDEBUG.
--
Steve
 
May 20, 2008
11,800
118
Syracuse, NY, USA
On Mon, 18 Jul 2011 17:12:27 -0400, Steve Fabian <> wrote:

|That's easily avoidable in new software. Just utilize the standard
|preprocessor macro NDEBUG.

I don't think NDEBUG is relevant. I don't use anything affected by it
(assert?).

I wish you a speedy recovery, Steve.
 
May 20, 2008
3,515
4
Elkridge, MD, USA
vefatica wrote:
| On Mon, 18 Jul 2011 17:12:27 -0400, Steve Fabian <> wrote:
|
|| That's easily avoidable in new software. Just utilize the standard
|| preprocessor macro NDEBUG.
|
| I don't think NDEBUG is relevant. I don't use anything affected by it
| (assert?).

Just a #if / #endif bracketing debug code... That allows a quick
recompilation with the flag flipped, all debugging code out.

|
| I wish you a speedy recovery, Steve.

Thanks to you and all others.
--
Steve