FOR vs DO for counted loops

A bit of a beginner problem perhaps, but something I noticed myself very recently.

It seems that for counted loops (n1 to n2) with a loop variable (e.g. i) that will be used inside the loop, FOR is preferable to DO, even though the help says it is mostly included for CMD.EXE compatibility.

The reason is that FOR seems to support using the same loop variable name in nested loops while behaving sensibly (i.e. changes to the loop variable in the inner loop do not affect the loop variable used in the outer loop). This is a very desirable property for code composability and maintenance (the outer loop might, for instance, call a GOSUB routine, and it is not always obvious that the routine contains an inner loop).

Do I have this right, or did I maybe miss some important detail somewhere?

Another question: in CMD.EXE batch files, FOR loop variables must have two percent signs, while one is enough on the command line. TCC, on the other hand, seems to work fine with only one percent sign in batch files as well. Is this something you can count on, or does it just happen to work? Two percent signs also work with TCC.

Test batch file to illustrate my points regarding nested loops with same loop variable names.

Code:
@echo off

echo.
ECHO FOR /L
echo.

for /l %val in (1,1,2) do (
   echo Outer1 %val
   for /l %val in (1,1,2) do (
     echo `  `Inner %val
   )
   echo Outer2 %val
)

echo.
ECHO DO
echo.

do i = 1 to 2
   echo Outer1 %i%
   do i = 1 to 2
     echo `  `Inner %i%
   enddo
   echo Outer2 %i%
enddo

echo.
ECHO _do_loop testing
echo.

do 3
   echo Outer1 %_do_loop
   do 3
     echo `  `Inner %_do_loop
   enddo
   echo Outer2 %_do_loop
enddo
 
The behavior of "DO i=m to n" seems correct (there is only one variable i) and useful. When the loop is finished, you can use the value of i to determine whether the loop ran to the end or was terminated early.
Code:
do i=1 to 3
   echo %i
   if %i == 2 leave
enddo
if %i != 4 echo DO loop terminated early!

It's reasonable to expect the user to use different variable names when nesting such loops.

I could make the same argument for _do_loop since it outlives the do loop.
Code:
v:\> do 2 ( echo foo ) & echo %_do_loop
foo
foo
3

And I could sympathize with the contrary argument that the value of _do_loop should depend on the DO nesting level. Maybe we need _do_level[n]

If you prefer FOR, use it. I'm glad to be rid of it.
 
It seems that for counted loops (n1 to n2) with a loop variable (e.g. i) that will be used inside the loop, FOR is preferable to DO, even though the help says it is mostly included for CMD.EXE compatibility.

The reason is that FOR seems to support using the same loop variable name in nested loops while behaving sensibly (i.e. changes to the loop variable in the inner loop do not affect the loop variable used in the outer loop). This is a very desirable property for code composability and maintenance (the outer loop might, for instance, call a GOSUB routine, and it is not always obvious that the routine contains an inner loop).

Do I have this right, or did I maybe miss some important detail somewhere?

IMHO, based on half a century of coding programs, using the same variable name for an outer and an inner loop is a major cause of confusion and a very bad practice, to be avoided at all costs, especially if the inner loop is not surrounded by a SETLOCAL - ENDLOCAL pair. BTW, it is clear that with DO loops, rather than FOR loops, using the same variable name in the inner loop overwrites the one in the outer loop, so unintentional infinite loops and other major mistakes are very easy to create by reusing control variable names...

Another question: in CMD.EXE batch files, FOR loop variables must have two percent signs, while one is enough on the command line. TCC, on the other hand, seems to work fine with only one percent sign in batch files as well. Is this something you can count on, or does it just happen to work? Two percent signs also work with TCC.
WAD. You can depend on it. See the FOR command's FORMAT in HELP.
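
For a batch file that has to run under both CMD.EXE and TCC, a minimal sketch that relies only on what is said in this thread (CMD.EXE requires doubled percent signs in batch files, and TCC accepts them too). The variable name i and the range are arbitrary.

Code:
@echo off
rem Doubled percent signs: required by CMD.EXE inside batch files,
rem and reported in this thread to work in TCC as well.
for /l %%i in (1,1,3) do echo %%i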

Code:
v:\> do 2 ( echo foo ) & echo %_do_loop
foo
foo
3
IMHO the value of _do_loop at the end is WRONG! From HELP -> DO:
%_do_loop The number of times the DO loop has been executed

It is obvious from the number of times foo is printed that the loop was executed twice, not thrice. I am posting a separate thread for this issue.

And ... the value of _do_loop should depend on the DO nesting level. Maybe we need _do_level[n]
A pseudoarray of _do_loop[n], to be created only by executing LEAVE n, where n>1, is indeed desirable.
 
The behavior of "DO i=m to n" seems correct (there is only one variable i) and useful. When the loop is finished, you can use the value of i to determine whether the loop ran to the end or was terminated early.

Oh, I'm not complaining about the behaviour of DO here, just illustrating how it works differently from FOR.

It's reasonable to expect the user to use different variable names when nesting such loops.

I agree, and I would do so if the loops were lexically nested in the batch file. When I stumbled upon this little problem (and wasted some time debugging it), I had a loop that happened to indirectly contain another loop, which was used inside an ALIAS, so I was not immediately aware of it. (The code was not written that way originally, but ended up so when I added new features and moved things around.)

If you prefer FOR, use it. I'm glad to be rid of it.

It does not look as nice, true, but since its loop variable is contained inside the FOR construct and does not seem to touch the environment before or after it, it seems much safer for use in aliases and subroutines (in case you don't wrap them in SETLOCAL/ENDLOCAL pairs).
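
For reference, a minimal and untested sketch of that SETLOCAL/ENDLOCAL wrapping for DO, assuming ENDLOCAL restores the caller's i before the subroutine returns. The label name inner is made up for illustration.

Code:
@echo off
do i = 1 to 2
   echo Outer1 %i
   gosub inner
   echo Outer2 %i
enddo
quit

rem Hypothetical subroutine: SETLOCAL saves the environment (including the
rem caller's i); ENDLOCAL is assumed to put it back before RETURN.
:inner
setlocal
do i = 1 to 2
   echo `  `Inner %i
enddo
endlocal
return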
 
IMHO, based on half a century of coding programs, using the same variable name for an outer and an inner loop is a major cause of confusion and a very bad practice, to be avoided at all costs, especially if the inner loop is not surrounded by a SETLOCAL - ENDLOCAL pair. BTW, it is clear that with DO loops, rather than FOR loops, using the same variable name in the inner loop overwrites the one in the outer loop, so unintentional infinite loops and other major mistakes are very easy to create by reusing control variable names...

I agree, the documentation makes it very clear how DO is supposed to work.

It's just an easy trap to fall into if you're more used to high-level languages (and prefer "i" as the loop variable name!), where the loop variable is contained inside the loop. Luckily, it seems FOR (/L) works that way in TCC as well.

WAD. You can depend on it. See the FOR command's FORMAT in HELP.

Thanks, just wanted to make sure.

By mistake I also used FOR without any percent signs at all (e.g. "for /L i in (1,1,10) do echo %i"), which seemed to work at first but had some not-so-nice side effects on (GOSUB) variables whose names started with "i".

Maybe the parser should/could give a syntax error in that case?
 
The reason is that FOR seems to support using the same loop variable name in nested loops while behaving sensibly (i.e. changes to the loop variable in the inner loop do not affect the loop variable used in the outer loop). This is a very desirable property for code composability and maintenance (the outer loop might, for instance, call a GOSUB routine, and it is not always obvious that the routine contains an inner loop).

Do I have this right, or did I maybe miss some important detail somewhere?

If memory serves, this is only true for single-character variable names. If the variable name is more than one letter long, FOR will use an environment variable just like DO does.
 
Am I missing something? My CMD can't deal with varnames longer than one character!

Code:
v:\> type inout.bat
@echo off

for /L %%v in (1,1,2) do echo %%v

for /L %%val in (1,1,2) do echo %%val


v:\> inout.bat
1
2
%val was unexpected at this time.
v:\>
 
If memory serves, this is only true for single-character variable names. If the variable name is more than one letter long, FOR will use an environment variable just like DO does.

I did a quick test to check.

Code:
set val=test
echo val = %val
for /l %val in (1,1,2) do (
   echo %val
)
echo val = %val

Seems to work OK: "val = test" is printed both before and after the FOR loop.

But it seems there might be something to using single-character variable names.

Code:
set ivar=test
for /l %i in (1,1,2) do (
   echo i = %i
   echo ivar = %ivar
   echo ivarbothsides = %ivar%
)

The output of this code snippet is a little surprising!

Code:
i = 1
ivar = 1var
ivarbothsides = test
i = 2
ivar = 2var
ivarbothsides = test

If you name the loop variable "ir" instead (not a starting substring of ivar), then it works as you want.
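
That is, something like the following variant of the snippet above, with only the loop variable renamed. I'm not listing the output here; per the observation above, %ivar should now expand to the environment variable.

Code:
set ivar=test
rem Loop variable renamed from i to ir, which is not a leading substring of ivar
for /l %ir in (1,1,2) do (
   echo ir = %ir
   echo ivar = %ivar
)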



One more observation:

If a variable you use in the FOR loop is not defined, putting %-signs on both sides of it does not protect it from being mangled by the loop variable (when the loop variable name is a leading substring of it).

Luckily, it seems that using the %[varnamehere] notation instead also protects against this case.

Code:
for /l %i in (1,1,2) do (
   echo i = %i
   echo ivar = %ivar ERROR!
   echo ivarboth = %ivar%ERROR!
   echo ivarboth = %ivar% ERROR!
   echo ivarSafe = %[ivar]Ok.
)

==>

i = 1
ivar = 1var ERROR!
ivarboth = 1var!
ivarboth = 1var ERROR!
ivarSafe = Ok.
i = 2
ivar = 2var ERROR!
ivarboth = 2var!
ivarboth = 2var ERROR!
ivarSafe = Ok.

So I guess it might be a good idea to always use %[] notation to make your batch code as robust as possible? Or might there be some downsides? Besides taking longer to type.

Or you could avoid FOR and use DO instead, which does not handle loop variables in this way! ;)

But then you lose the nice property of loop variables that do not pollute the environment. Trade-offs, trade-offs, ...
 