
Regular expressions?

Discussion in 'Support' started by vefatica, Dec 26, 2012.

  1. vefatica

    Joined:
    May 20, 2008
    Messages:
    7,883
    Likes Received:
    29
    I suppose "::..\..." means (at least) two characters, then a '.' and then (at least) two characters. But that regular expression never works!
    Code:
    v:\zips> *dir
     
    Volume in drive V is DATA          Serial number is c007:d3e4
    Directory of  V:\zips\*
     
    2012-11-26  23:28        <DIR>    .
    2012-11-26  23:28        <DIR>    ..
    2012-11-26  23:28        <DIR>    Save
    2012-11-26  23:28        <DIR>    shralias_ascii_save_files
    2012-11-26  23:28        <DIR>    X64
    2012-11-25  15:40          38,933  4console.zip
    2012-11-26  15:43          86,596  4threads.zip
    2012-11-26  15:30          61,867  4utils.zip
    2012-11-06  05:41          56,097  sysutils.zip
              243,493 bytes in 4 files and 5 dirs    253,952 bytes allocated
        6,785,912,832 bytes free
     
    v:\zips> *dir "::..\..."
     
    Volume in drive V is DATA          Serial number is c007:d3e4
    TCC: (Sys) The system cannot find the file specified.
    "V:\zips\::..\..\.."
                    0 bytes in 0 files and 0 dirs
    The error message above gives a clue; in it, my regular expression was changed!

    Oddly, "::.\.." (char, dot, char) and "::...\...." (3 chars, dot, 3 chars) work.
    Code:
    v:\zips> *dir "::.\.."
     
    Volume in drive V is DATA          Serial number is c007:d3e4
    Directory of  V:\zips\::.\..
     
    2012-11-25  15:40          38,933  4console.zip
    2012-11-26  15:43          86,596  4threads.zip
    2012-11-26  15:30          61,867  4utils.zip
    2012-11-06  05:41          56,097  sysutils.zip
              243,493 bytes in 4 files and 0 dirs    253,952 bytes allocated
        6,785,912,832 bytes free
     
    v:\zips> *dir "::...\...."
     
    Volume in drive V is DATA          Serial number is c007:d3e4
    Directory of  V:\zips\::...\....
     
    2012-11-25  15:40          38,933  4console.zip
    2012-11-26  15:43          86,596  4threads.zip
    2012-11-26  15:30          61,867  4utils.zip
    2012-11-06  05:41          56,097  sysutils.zip
              243,493 bytes in 4 files and 0 dirs    253,952 bytes allocated
        6,785,912,832 bytes free
    
     
  2. rconn

    rconn Administrator
    Staff Member

    Joined:
    May 14, 2008
    Messages:
    9,804
    Likes Received:
    82
    WAD - embedded multiple (more than two consecutive) .'s in a filename are expanded into "extended parent directory names" (and have been for the last 20 years). See the help for details.
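
The rewrite described above can be sketched in Python. This is only an illustration inferred from the error messages in this thread ("..." becomes "..\.." and "...." becomes "..\..\.."); the function name `expand_dot_run` is hypothetical, and the sketch does not model how TCC decides whether a given run of dots counts as a valid extended parent directory name.

```python
def expand_dot_run(component: str) -> str:
    """Rewrite a component made only of 3+ dots: '...' -> '..\\..'.

    Assumption: a run of n dots (n >= 3) becomes n - 1 copies of '..'
    joined by backslashes, as the error messages in this thread suggest.
    """
    if len(component) >= 3 and set(component) == {"."}:
        return "\\".join([".."] * (len(component) - 1))
    return component


# Apply the rewrite to the part after the last backslash of the
# argument from the first post.
arg = "::..\\..."
head, sep, tail = arg.rpartition("\\")
rewritten = head + sep + expand_dot_run(tail)
print(rewritten)  # ::..\..\..
```

The printed string is the one shown in the error message for `*dir "::..\..."`, which is why the regular expression never reaches the matcher in its original form.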
     
  3. Steve Fabian

    Joined:
    May 20, 2008
    Messages:
    3,523
    Likes Received:
    4
    But a regex is not a filename... The very useful syntax of extending multiple consecutive periods into extended parent directory names ought not to apply INSIDE a regular expression, where alternate syntax is used... though undoubtedly it would be a tough job for the parser.
     
  4. rconn

    rconn Administrator
    Staff Member

    Joined:
    May 14, 2008
    Messages:
    9,804
    Likes Received:
    82
    In this case, a regex definitely *is* (at least part of) the filename -- and the extended parent directory name expansion is done before the wildcard and/or regular expression parsing.

    Changing that would require rewriting much of the command line parser (several months' work at least), and would definitely result in breaking a few million existing batch files and aliases.
     
  5. Steve Fabian

    Joined:
    May 20, 2008
    Messages:
    3,523
    Likes Received:
    4
    Yes, I guessed that.
    The effort required is not surprising. I would quibble about the number of programs the change would affect, but that is irrelevant - the benefits are certainly not worth the effort. However, there ought to be a way to specify that the user wants to select all files with a name of at least 2 characters and an extension of at least 1 character. Well, there is!

    The file match string "*[?][?].[?]*" matches files (or in the appropriate context also directories) with a name of at least 2 characters and an extension of at least 1 character. It does not require using a regex, so it ought to operate faster, too.
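
For comparison, the same intent (a name of at least 2 characters and an extension of at least 1 character) can be written as a regular expression with quantifiers instead of runs of dots. The sketch below checks it with Python's `re` module, not with TCC or Oniguruma; the pattern and the names `PATTERN` and `names` are illustrative assumptions, and whether TCC would pass this pattern through unchanged is not verified here.

```python
import re

# Assumed reading of the requirement: at least 2 characters, a literal
# dot, then an extension of at least 1 character containing no dot.
PATTERN = re.compile(r".{2,}\.[^.]+")

# Sample names taken from the directory listings in this thread.
names = ["x.txt", "xx.txt", "4console.zip", "Save", "sysutils.zip"]

# fullmatch() requires the whole name to fit the pattern.
matches = [n for n in names if PATTERN.fullmatch(n)]
print(matches)  # ['xx.txt', '4console.zip', 'sysutils.zip']
```

`x.txt` is rejected because its name is only one character long, and `Save` is rejected because it has no extension.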
     
  6. vefatica

    Joined:
    May 20, 2008
    Messages:
    7,883
    Likes Received:
    29
    How dare you look inside my regular expression! Backslashes and dots are pretty common in regular expressions. IMHO, anything following "::" should be treated as a regular expression. If the user intends a path specification, let him, for example, use "..\subdir\::regex".
     
  7. rconn

    rconn Administrator
    Staff Member

    Joined:
    May 14, 2008
    Messages:
    9,804
    Likes Received:
    82
    Sure. Just requires a few months for a parser rewrite and introducing a few zillion new incompatibilities. :banghead:

    I really don't think that three or more consecutive dots (backslashes are irrelevant) are especially common in RE's. You somehow managed to avoid using them for the past few years ...
     
  8. vefatica

    Joined:
    May 20, 2008
    Messages:
    7,883
    Likes Received:
    29
    Three or more doesn't seem to be a problem. One or two is a problem. Please explain what's happening below, where three or more dots (before the "\.") give the desired behavior (and one or two don't).
    Code:
    v:\test> *dir
    
     Volume in drive V is DATA           Serial number is c007:d3e4
     Directory of  V:\test\*
    
    2012-12-26  22:53         <DIR>    .
    2012-12-26  22:53         <DIR>    ..
    2012-12-26  22:53               0  x.txt
    2012-12-26  22:53               0  xx.txt
    2012-12-26  22:53               0  xxx.txt
    2012-12-26  22:53               0  xxxx.txt
    2012-12-26  22:53               0  xxxxx.txt
    2012-12-26  22:53               0  xxxxxx.txt
    2012-12-26  22:53               0  xxxxxxx.txt
    2012-12-26  22:53               0  xxxxxxxx.txt
    2012-12-26  22:53               0  xxxxxxxxx.txt
    2012-12-26  22:53               0  xxxxxxxxxx.txt
                     0 bytes in 10 files and 2 dirs
         6,785,961,984 bytes free
    
    v:\test> *dir "::.\...."
    
     Volume in drive V is DATA           Serial number is c007:d3e4
    TCC: (Sys) The system cannot find the file specified.
     "V:\test\::.\..\..\.."
                     0 bytes in 0 files and 0 dirs
         6,785,961,984 bytes free
    
    v:\test> *dir "::..\...."
    
     Volume in drive V is DATA           Serial number is c007:d3e4
    TCC: (Sys) The system cannot find the file specified.
     "V:\test\::..\..\..\.."
                     0 bytes in 0 files and 0 dirs
         6,785,961,984 bytes free
    
    v:\test> *dir "::...\...."
    
     Volume in drive V is DATA           Serial number is c007:d3e4
     Directory of  V:\test\::...\....
    
    2012-12-26  22:53               0  xxx.txt
    2012-12-26  22:53               0  xxxx.txt
    2012-12-26  22:53               0  xxxxx.txt
    2012-12-26  22:53               0  xxxxxx.txt
    2012-12-26  22:53               0  xxxxxxx.txt
    2012-12-26  22:53               0  xxxxxxxx.txt
    2012-12-26  22:53               0  xxxxxxxxx.txt
    2012-12-26  22:53               0  xxxxxxxxxx.txt
                     0 bytes in 8 files and 0 dirs
         6,785,961,984 bytes free
    
    v:\test> *dir "::....\...."
    
     Volume in drive V is DATA           Serial number is c007:d3e4
     Directory of  V:\test\::....\....
    
    2012-12-26  22:53               0  xxxx.txt
    2012-12-26  22:53               0  xxxxx.txt
    2012-12-26  22:53               0  xxxxxx.txt
    2012-12-26  22:53               0  xxxxxxx.txt
    2012-12-26  22:53               0  xxxxxxxx.txt
    2012-12-26  22:53               0  xxxxxxxxx.txt
    2012-12-26  22:53               0  xxxxxxxxxx.txt
                     0 bytes in 7 files and 0 dirs
         6,785,961,984 bytes free
    
    v:\test> *dir "::.....\...."
    
     Volume in drive V is DATA           Serial number is c007:d3e4
     Directory of  V:\test\::.....\....
    
    2012-12-26  22:53               0  xxxxx.txt
    2012-12-26  22:53               0  xxxxxx.txt
    2012-12-26  22:53               0  xxxxxxx.txt
    2012-12-26  22:53               0  xxxxxxxx.txt
    2012-12-26  22:53               0  xxxxxxxxx.txt
    2012-12-26  22:53               0  xxxxxxxxxx.txt
                     0 bytes in 6 files and 0 dirs
         6,785,961,984 bytes free
    
    v:\test> *dir "::......\...."
    
     Volume in drive V is DATA           Serial number is c007:d3e4
     Directory of  V:\test\::......\....
    
    2012-12-26  22:53               0  xxxxxx.txt
    2012-12-26  22:53               0  xxxxxxx.txt
    2012-12-26  22:53               0  xxxxxxxx.txt
    2012-12-26  22:53               0  xxxxxxxxx.txt
    2012-12-26  22:53               0  xxxxxxxxxx.txt
                     0 bytes in 5 files and 0 dirs
         6,785,961,984 bytes free
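
The listings above are consistent with ordinary unanchored regex matching once the pattern survives intact. A small check with Python's `re` module (an assumption: TCC uses Oniguruma, but an unescaped `.` and an escaped `\.` mean the same thing in both) reproduces the file counts, and also shows what the two failing patterns would have matched had they not been rewritten. The helper name `matching` is hypothetical.

```python
import re

# The ten test files from the listing: x.txt through xxxxxxxxxx.txt.
names = ["x" * n + ".txt" for n in range(1, 11)]


def matching(pattern: str) -> list[str]:
    """Return the names in which the pattern is found anywhere (unanchored)."""
    return [n for n in names if re.search(pattern, n)]


# An unescaped '.' matches any character; '\.' matches a literal dot.
for pattern in (r".\....", r"..\....", r"...\....", r"....\....",
                r".....\....", r"......\...."):
    print(pattern, len(matching(pattern)))
```

With these inputs the last four patterns select 8, 7, 6 and 5 files, matching the output above; the first two would select 10 and 9 files if the expansion did not intervene.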
    
     
  9. rconn

    rconn Administrator
    Staff Member

    Joined:
    May 14, 2008
    Messages:
    9,804
    Likes Received:
    82
    WAD -- the first two are valid extended parent directory names (and are expanded appropriately); the others aren't.
     
  10. JohnQSmith

    Joined:
    Jan 19, 2011
    Messages:
    564
    Likes Received:
    8
    Then it's designed incorrectly. You can't say "this is a regex and it follows the rules in this Oniguruma document" and then turn around and say "well, it doesn't really follow the rules because it's expanding parent paths even though they are in a specifically designated regex."

    Really? Blaming the user again?
     
  11. rconn

    rconn Administrator
    Staff Member

    Joined:
    May 14, 2008
    Messages:
    9,804
    Likes Received:
    82
    The extended parent directory names preceded the regular expression support by about 15 years. There's been a grand total of one reported problem with the parent directory + regular expression syntax thus far; reversing the parsing order would cause thousands of problems. (And extended parent names are used far more often than RE's.)

    Hardly. The original problem was invented & not particularly realistic, and there are existing workarounds that are much more practical than spending a few months rewriting the parser (and breaking everybody *else's* syntax).
     
  12. JohnQSmith

    Joined:
    Jan 19, 2011
    Messages:
    564
    Likes Received:
    8
    I totally understand this. I love extended parent directory names. I find myself in CMD typing something like "cd ..." and then cursing and looking for my jump stick and my portable TC.

    It's a defined regex block, there shouldn't have to be a workaround.

    I would like to offer a request. PLEASE!!! Think about rewriting the parser so that when it sees the "::" syntax it will: 1) stop whatever it's doing, 2) do a lookahead to find where the regex ends, 3) interpret the regex, 4) return the result to the parser so it can continue.

    Basically, "::this_is_a_regex" so the regular command line parser should keep its grubby mitts off. The parser should look at it and say, "I'm not allowed to touch anything inside of that construct." The regex should have an entirely separate handler. Perhaps it could process the text or list of files or whatever else and then return an array of results back to main parser for continued processing.
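
The requested behavior can be sketched in Python as follows. This illustrates the proposal only, not how TCC works today or how it would be implemented; the names `expand_dot_runs` and `shield_regex` are hypothetical, and the sketch assumes the regex runs from the first "::" to the end of the argument.

```python
REGEX_MARKER = "::"


def expand_dot_runs(path: str) -> str:
    """Expand every backslash-separated component made only of 3+ dots."""
    parts = path.split("\\")
    return "\\".join(
        "\\".join([".."] * (len(p) - 1))
        if len(p) >= 3 and set(p) == {"."}
        else p
        for p in parts
    )


def shield_regex(arg: str) -> str:
    """Expand dot runs in the path part only; leave the regex untouched."""
    path, marker, regex = arg.partition(REGEX_MARKER)
    if not marker:
        # No regex in this argument: expand as usual.
        return expand_dot_runs(arg)
    return expand_dot_runs(path) + marker + regex


print(shield_regex("::..\\..."))             # ::..\...
print(shield_regex("...\\subdir\\::x\\..."))  # ..\..\subdir\::x\...
```

Under this scheme a leading path such as "...\subdir\" still gets the extended parent directory expansion, while everything after "::" reaches the regex engine exactly as typed.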
     
  13. rconn

    rconn Administrator
    Staff Member

    Joined:
    May 14, 2008
    Messages:
    9,804
    Likes Received:
    82
    I can do that, but it's going to require a major parser rewrite (> 30K lines of code, and several weeks at a minimum). Major parser changes like this tend to have unfortunate side effects of breaking lots of existing code (and CMD compatibility), so I try to avoid them unless there's a compelling reason. IMHO this issue hasn't yet risen to the level that would warrant the effort & pain involved.
     
