Speed question

Discussion in 'Support' started by vefatica, Aug 26, 2010.

  1. vefatica

    Joined:
    May 20, 2008
    Messages:
    7,939
    Likes Received:
    30
    I'm testing a pair of plugin functions. The command VARNAMES outputs the names of the environment variables, one per line. The variable function @VARNAMES[] returns an =-separated list of the environment variable names. The two use nearly identical code, the only difference being

    Code:
    if ( !bUseRegex || (rx = regex_match(szRegex, szVarName)) > 0 )
        Printf(L"%s\r\n", szVarName);
    for VARNAMES, vs.

    Code:
    if ( !bUseRegex || (rx = regex_match(szRegex, szVarName)) > 0 )
        psz += Sprintf(psz, L"%s=", szVarName);
    for @VARNAMES.

    I'd expect them to be nearly equally fast. But they're not even close (see below). Each will accept a regular expression on which to match variable names but below, none is specified, so regex_match() is not called.

    Code:
    v:\> timer & for /l %i in (1,1,10000) (varnames > nul) & timer
    Timer 1 on: 19:51:00
    Timer 1 off: 19:51:14  Elapsed: 0:00:13.14
    
    v:\> timer & for /l %i in (1,1,10000) (echo %@varnames[] > nul) & timer
    Timer 1 on: 19:51:15
    Timer 1 off: 19:51:18  Elapsed: 0:00:02.94
    Rex, can you guess what accounts for the huge difference in times? I have no complaints; I'm just very curious.
     
  2. rconn

    rconn Administrator
    Staff Member

    Joined:
    May 14, 2008
    Messages:
    9,854
    Likes Received:
    83
    It's always a lot faster to display one big string rather than a bunch of
    little ones. (STDOUT is not buffered, and you've got a lot more API
    overhead.)

    Rex Conn
    JP Software
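
The point about one big string versus many small ones can be sketched in portable C. This is only an illustration under stated assumptions: `Sink`, `sink_write`, `emit_per_line` and `emit_batched` are hypothetical names, the sink merely counts calls instead of writing to the console, and standard `swprintf` with `%ls` stands in for the plugin's `Sprintf` with `%s`.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical stand-in for an unbuffered output call: it only records
   how often it is called and how many characters it receives. */
typedef struct {
    size_t calls;
    size_t chars;
} Sink;

static void sink_write(Sink *s, const wchar_t *text)
{
    s->calls++;
    s->chars += wcslen(text);
}

/* One output call per name, like the original VARNAMES command. */
static void emit_per_line(Sink *s, const wchar_t **names, size_t n)
{
    wchar_t line[256];
    for (size_t i = 0; i < n; i++) {
        /* Standard C uses %ls for wide strings in swprintf. */
        if (swprintf(line, 256, L"%ls\r\n", names[i]) < 0)
            continue;               /* name too long for this sketch */
        sink_write(s, line);
    }
}

/* Build one string first, then make a single output call. */
static void emit_batched(Sink *s, const wchar_t **names, size_t n)
{
    wchar_t all[1024];
    size_t used = 0;
    for (size_t i = 0; i < n; i++) {
        int k = swprintf(all + used, 1024 - used, L"%ls\r\n", names[i]);
        if (k < 0)
            break;                  /* out of room: stop appending */
        used += (size_t)k;
    }
    all[used] = L'\0';              /* cut off any partial write */
    sink_write(s, all);
}
```

Both functions deliver the same characters; only the number of calls into the sink differs, and that per-call cost is what the timings in this thread appear to measure.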
     
  3. vefatica

    Joined:
    May 20, 2008
    Messages:
    7,939
    Likes Received:
    30
    Thanks. I hadn't thought of that (and if I had, probably wouldn't have realized it was significant). But a little test exemplifies it:

    Code:
    v:\> timer & for /l %i in (1,1,5000) ((echo a & echo b & echo c & echo d) > NUL) & timer
    Timer 1 on: 22:20:48
    Timer 1 off: 22:20:52  Elapsed: 0:00:03.86
    
    v:\> timer & for /l %i in (1,1,5000) (echo a^r^nb^r^nc^r^nd > NUL) & timer
    Timer 1 on: 22:20:58
    Timer 1 off: 22:20:59  Elapsed: 0:00:01.25
     
  4. vefatica

    Joined:
    May 20, 2008
    Messages:
    7,939
    Likes Received:
    30
    So I added simple buffering to the command version (VARNAMES).

    Code:
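    if ( bList ) // @VARNAMES: append "name=" to the result string
        r += Sprintf(r, L"%s=", szVarName);
    else        // VARNAMES command: append "name\r\n", flush in chunks
    {
        r += Sprintf(r, L"%s\r\n", szVarName);
        if ( r-psz >= 1024 ) // at least 1024 characters are pending
        {
            Printf(L"%s", psz); // write the whole chunk with one call
            *psz = 0;
            r = psz;            // refill from the start of the buffer
        }
    }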
    The command went from about 1/4 the speed of the variable function to **twice** the speed. That (above) is pretty much the difference between the two (except for the ECHO needed to output the variable function version).

    So now I must ask if there's that much more overhead involved in calling the variable function compared to calling the command. They're both under .0005 seconds so I'm not complaining.

    I also noticed that I can "ECHO @VARNAMES[]" when there's over 24000 characters in the string (varname1=varname2=...). What are the limits these days?
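
The chunked flushing described in this post can be sketched as a self-contained function. It is an illustration, not the plugin's code: `Sink`, `sink_write` and `list_names` are hypothetical names, the threshold is a parameter rather than the fixed 1024, and standard `swprintf` with `%ls` replaces `Sprintf`. The sketch also includes a final flush for whatever is left in the buffer after the loop, a step the posted fragment does not show.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical output sink: counts flushes and characters. */
typedef struct {
    size_t flushes;
    size_t chars;
} Sink;

static void sink_write(Sink *s, const wchar_t *text)
{
    s->flushes++;
    s->chars += wcslen(text);
}

/* Appends "name\r\n" for each name to buf and flushes once at least
   `threshold` characters are pending. `cap` is the capacity of buf in
   wide characters and is assumed to be comfortably larger than
   `threshold` plus the longest line. Returns the number of names
   that were appended. */
static size_t list_names(Sink *s, wchar_t *buf, size_t cap,
                         size_t threshold,
                         const wchar_t **names, size_t n)
{
    wchar_t *r = buf;               /* next free position in buf */
    size_t i;

    for (i = 0; i < n; i++) {
        int k = swprintf(r, cap - (size_t)(r - buf), L"%ls\r\n", names[i]);
        if (k < 0)
            break;                  /* line did not fit: stop */
        r += k;
        if ((size_t)(r - buf) >= threshold) {
            sink_write(s, buf);     /* one call for the whole chunk */
            r = buf;                /* start refilling from the front */
        }
    }

    *r = L'\0';                     /* terminate the pending part */
    if (r > buf)
        sink_write(s, buf);         /* final flush of the remainder */
    return i;
}
```

With a threshold of 16 and the names PATH, TEMP, USERNAME and OS, the first flush happens after USERNAME (22 characters pending) and the last 4 characters go out in the final flush.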
     
  5. rconn

    rconn Administrator
    Staff Member

    Joined:
    May 14, 2008
    Messages:
    9,854
    Likes Received:
    83
    Yes, there's that much more overhead in the (very, very complex) variable
    substitution.


    32K for an input line, 64K expanded.

    Rex Conn
    JP Software
     
  6. vefatica

    Joined:
    May 20, 2008
    Messages:
    7,939
    Likes Received:
    30
    On Fri, 27 Aug 2010 22:17:43 -0400, rconn wrote:

    |---Quote---
    |> I also noticed that I can "ECHO @VARNAMES[]" when there's over 24000
    |> characters in the string (varname1=varname2=...). What are the limits
    |> these days?
    |---End Quote---
    |32K for an input line, 64K expanded.

    I see. And the size of any one token in the expanded line doesn't
    seem to matter. You might up the 8191 limit on @REPEAT, then.
     
  7. rconn

    rconn Administrator
    Staff Member

    Joined:
    May 14, 2008
    Messages:
    9,854
    Likes Received:
    83
    Not really feasible in 32-bit Windows, as it increases the stack usage
    dramatically.

    Rex Conn
    JP Software
     
  8. Steve Fabian

    Joined:
    May 20, 2008
    Messages:
    3,520
    Likes Received:
    4
    | Code:
    | ---------
    | if ( bList ) // @VARNAMES
    | r += Sprintf(r, L"%s=", szVarName);
    | else // VARNAMES command
    | {
    | r += Sprintf(r, L"%s\r\n", szVarName);
    | if ( r-psz >= 1024 )
    | {
    | Printf(L"%s", psz);
    | *psz = 0;
    | r = psz;
    | }
    | }
    | ---------

    Since you are so interested in speed, did you consider other speed-up
    methods, e.g., eliminating all the format-string parsing and
    pseudoformatting done by Sprintf and instead using memcpy, strcpy, or their
    Unicode equivalents to build your output string, and precalculating the
    upper limit of your buffer so you don't need to add 1024 each time? BTW, you
    WILL overflow a 1024-element buffer if the total is more than 1024 elements,
    because you test AFTER appending; resetting the first element of the buffer
    to NUL is not needed, either - the next szVarName will overwrite it,
    anyway.
    --
    Steve
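
The two suggestions in this post (copy with a known length instead of formatting, and test for room before appending) could look roughly like the sketch below. `append_line` is a hypothetical helper, not part of the plugin, and the buffer capacity is passed in explicitly because the real size of `psz` is not given in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Appends name followed by CRLF at offset *used in buf.
   cap is the capacity of buf in wide characters, terminator included.
   Returns 1 on success, or 0 (buffer untouched) if the line would
   not fit, so the caller can flush and retry. */
static int append_line(wchar_t *buf, size_t cap, size_t *used,
                       const wchar_t *name)
{
    size_t len = wcslen(name);      /* the one extra length pass */

    /* Check BEFORE copying: name + CR + LF + terminating NUL. */
    if (*used >= cap || len + 3 > cap - *used)
        return 0;

    wmemcpy(buf + *used, name, len);
    buf[*used + len]     = L'\r';
    buf[*used + len + 1] = L'\n';
    buf[*used + len + 2] = L'\0';
    *used += len + 2;               /* NUL is overwritten next time */
    return 1;
}
```

When `append_line` returns 0, a caller would typically write out the buffer, reset `*used` to 0 and append again. Because the check happens first, the buffer can be exactly as large as the flush size without risk of overflow.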
     
  9. vefatica

    Joined:
    May 20, 2008
    Messages:
    7,939
    Likes Received:
    30
    On Sat, 28 Aug 2010 12:25:33 -0400, Steve Fábián wrote:

    || Code:
    || ---------
    || if ( bList ) // @VARNAMES
    || r += Sprintf(r, L"%s=", szVarName);
    || else // VARNAMES command
    || {
    || r += Sprintf(r, L"%s\r\n", szVarName);
    || if ( r-psz >= 1024 )
    || {
    || Printf(L"%s", psz);
    || *psz = 0;
    || r = psz;
    || }
    || }
    || ---------
    |
    | Since you are so interested in speed, did you consider other speed-up
    |methods, e.g., eliminating all the format-string parsing and
    |pseudoformatting done by Sprintf and instead using memcpy, strcpy, or their
    |Unicode equivalent to build your output string, and to precalculate the
    |upper limit of your buffer so you don't need to add 1024 each time? BTW, you
    |WILL overflow a 1024-element buffer if the total is more than 1024 elements,
    |because you test AFTER appending; resetting the first element of the buffer
    |to NUL is not needed, either - the next szVarName will overwrite it,
    |anyway.

    The buffer (psz) is far bigger than 1024; it's the buffer passed to my
    function (as in "INT WINAPI VARNAMES (WCHAR *psz)").

    I suspect Sprintf() is quite fast; after seeing "%s" it knows it's
    copying a nul-terminated string and probably does something akin to
    the code below (which is what memcpy, wcscpy, etc. would do). With
    the method you suggested, I'd have to (separately) get the length of
    szVarName to properly increment r. So instead of Sprintf, I tried
    this (probably about as fast as it'll get, at the cost of adding 32
    bytes to the ".text" segment and making anyone reading the code think
    a little more).

    Code:
    WCHAR *z = szVarName;
    while ( *z )
    	*r++ = *z++;
    *r++ = L'\r';
    *r++ = L'\n';
    *r = 0;
    The difference (none; both runs took 2.80 seconds):

    Code:
    v:\> timer & for /l %i in (1,1,10000) (varnames > nul) & timer
    Timer 1 on: 13:33:57
    Timer 1 off: 13:34:00  Elapsed: 0:00:02.80
    
    v:\> timer & for /l %i in (1,1,10000) (varnames > nul) & timer
    Timer 1 on: 13:38:06
    Timer 1 off: 13:38:09  Elapsed: 0:00:02.80
    That's why sprintf, strcpy, memcpy (and friends and relatives) exist
    ... to prevent continual re-invention of the wheel. I think the
    overhead in such functions is small compared to the job they do.
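
As a rough check of the claim that the hand-written loop and the formatted append do the same job, the sketch below places the two side by side. `append_loop` and `append_fmt` are hypothetical names, and standard `swprintf` with `%ls` stands in for `Sprintf` with `%s`; it says nothing about relative speed, only that both leave the same text and advance the pointer by the same amount.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hand-written copy, as in the post: returns the new end pointer.
   The caller must guarantee that the destination has enough room. */
static wchar_t *append_loop(wchar_t *r, const wchar_t *name)
{
    while (*name)
        *r++ = *name++;
    *r++ = L'\r';
    *r++ = L'\n';
    *r = L'\0';
    return r;
}

/* Formatted append: the return value of swprintf is the number of
   characters written, so no separate length pass is needed.
   room is the space left at r in wide characters (at least 1). */
static wchar_t *append_fmt(wchar_t *r, size_t room, const wchar_t *name)
{
    int k = swprintf(r, room, L"%ls\r\n", name);
    if (k < 0) {
        *r = L'\0';                 /* did not fit: leave r empty */
        return r;
    }
    return r + k;
}
```

For the name COMSPEC both helpers advance by 9 characters (7 for the name plus CR and LF) and produce identical buffers.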
     
