@clip peculiarity

Discussion in 'Support' started by Steve Fabian, Jul 10, 2010.

  1. Steve Fabian

    Message Count:
    2,752
    Character sequence "CR CR LF" is treated by @CLIP as 2 EOLs. All other
    operations which deal with lines (@execarray, @fileseekl, @fileread, @line,
    ffind) consider it as 1 EOL. IMHO it would be desirable for @CLIP to be
    modified to behave identically to other operations.
    --
    Steve
  2. vefatica

    Message Count:
    3,983
    On Fri, 09 Jul 2010 22:53:29 -0400, Steve Fábián
    <> wrote:

    |Character sequence "CR CR LF" is treated by @CLIP as 2 EOLs.

    They are? How so?

    Code:
    v:\> echo foo^r^r^nbar > clip:
    
    v:\> echo %@clip[0]
    foo
    
    v:\> echo %@clip[1]
    ECHO is OFF
    
    v:\> echo %@clip[2]
    bar
  3. vefatica

    Message Count:
    3,983
    On Fri, 09 Jul 2010 23:25:25 -0400, vefatica <>
    wrote:

    |On Fri, 09 Jul 2010 22:53:29 -0400, Steve Fábián
    |<> wrote:
    |
    ||Character sequence "CR CR LF" is treated by @CLIP as 2 EOLs.
    |
    |They are? How so?
    |
    |
    |Code:
    |---------
    |v:\> echo foo^r^r^nbar > clip:
    |
    |v:\> echo %@clip[0]
    |foo
    |
    |v:\> echo %@clip[1]
    |ECHO is OFF
    |
    |v:\> echo %@clip[2]
    |bar
    |---------

    And they really made it to the clipboard. "LIST /X clip:" produces:

    Code:
    0000 0000 66 6f 6f 0d 0d 0a 62 61  72 0d 0a foo...bar..
  4. Steve Fabian

    Message Count:
    2,752
    | Steve wrote:
    ||
    || Character sequence "CR CR LF" is treated by @CLIP as 2 EOLs.
    ||
    | They are? How so?
    |
    |
    | Code:
    | ---------
    | v:\> echo foo^r^r^nbar > clip:
    |
    | v:\> echo %@clip[0]
    | foo
    |
    | v:\> echo %@clip[1]
    | ECHO is OFF
    |
    | v:\> echo %@clip[2]
    | bar
    | ---------

    You have just proved my statement: there is an extra, blank line in CLIP:.
    Try to put the same in a file, and use TYPE to display it - there won't be a
    blank line.
    --
    Steve
  5. vefatica

    Message Count:
    3,983
    On Fri, 09 Jul 2010 23:45:08 -0400, Steve Fábián
    <> wrote:

    || Steve wrote:
    |||
    ||| Character sequence "CR CR LF" is treated by @CLIP as 2 EOLs.
    |||
    || They are? How so?
    ||
    ||
    || Code:
    || ---------
    || v:\> echo foo^r^r^nbar > clip:
    ||
    || v:\> echo %@clip[0]
    || foo
    ||
    || v:\> echo %@clip[1]
    || ECHO is OFF
    ||
    || v:\> echo %@clip[2]
    || bar
    || ---------
    |
    |You have just proved my statement: there is an extra, blank line in CLIP:.
    |Try to put the same in a file, and use TYPE to display it - there won't be a
    |blank line.

    Yes, you're right. When the same data is in a file ...

    Code:
    v:\> list /x clip.txt
    0000 0000 66 6f 6f 0d 0d 0a 62 61  72 0d 0a  foo...bar..
    
    v:\> echo %@line[clip.txt,0]
    foo
    
    v:\> echo %@line[clip.txt,1]
    bar
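
The discrepancy shown above can be reproduced outside TCC with two different line-splitting rules. What follows is a minimal Python sketch, not TCC's actual implementation; the function names `split_each_terminator` and `split_collapsing_cr` are hypothetical. The first rule treats every CR, CR LF, or LF as its own terminator, which matches what @CLIP returned; the second treats a run of CRs followed by LF as one terminator, which matches what @LINE returned.

```python
import re

# The same bytes shown by LIST /X: 66 6f 6f 0d 0d 0a 62 61 72 0d 0a
DATA = "foo\r\r\nbar\r\n"


def _drop_trailing_empty(lines):
    # re.split leaves an empty piece after the final terminator; discard it
    if lines and lines[-1] == "":
        lines.pop()
    return lines


def split_each_terminator(text):
    # CR LF, a lone CR, and a lone LF each end a line (CR LF is tried first)
    return _drop_trailing_empty(re.split(r"\r\n|\r|\n", text))


def split_collapsing_cr(text):
    # One or more CRs followed by LF count as a single terminator
    return _drop_trailing_empty(re.split(r"\r+\n|\r|\n", text))


print(split_each_terminator(DATA))
print(split_collapsing_cr(DATA))
```

Under the first rule the extra CR produces an empty line between "foo" and "bar"; under the second it does not.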
  6. rconn Administrator

    Message Count:
    5,853
    That's a gibberish line ending -- TCC will recognize files with CR, CR/LF,
    or LF line endings. Something like CR/CR/LF cannot be interpreted.
    Different APIs are going to return different results, depending on how much
    of the text they scan.

    You can either clean up your input or wait for the DWIM parser!

    Rex Conn
    JP Software
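
One possible reading of "clean up your input" is to collapse redundant CRs before the text reaches the clipboard. The sketch below is Python, not TCC, and the helper name `normalize_eol` is hypothetical. It assumes that any run of CRs followed by LF was meant as a single CR LF, which is exactly the interpretation under debate in this thread.

```python
import re


def normalize_eol(data: bytes) -> bytes:
    # Collapse CR CR LF (or a longer run of CRs before LF) into one CR LF.
    # Lone CRs and lone LFs are left untouched.
    return re.sub(rb"\r+\n", b"\r\n", data)


# The bytes from the hex dump earlier in the thread
raw = bytes.fromhex("666f6f0d0d0a6261720d0a")
clean = normalize_eol(raw)
print(clean)
```

Because lone CRs are not modified, text that really uses CR-only line endings passes through unchanged.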
  7. vefatica

    Message Count:
    3,983
    On Sat, 10 Jul 2010 12:35:05 -0400, rconn <>
    wrote:

    |That's a gibberish line ending -- TCC will recognize files with CR, CR/LF,
    |or LF line endings. Something like CR/CR/LF cannot be interpreted.
    |Different APIs are going to return different results, depending on how much
    |of the text they scan.

    |You can either clean up your input or wait for the DWIM parser!

    I wouldn't call it gibberish. It's just redundant (and easily
    interpreted as meaning CRLF). Many console apps do it. Blaming
    Microsoft doesn't help us deal with it.

    The user should expect @LINE[file,N], @line[clip:,N] and @CLIP[N] to
    agree when the clipboard and the file contain exactly the same data
    ... don't you think? We may not have control over what gets
    redirected to a file or to the clipboard.
  8. rconn Administrator

    Message Count:
    5,853
    No, I do not.

    What if you have CR/CR/LF/CR/CR/CR? Is the *real* line ending a CR/LF, or
    is it CR with a random LF thrown in?

    The only way to handle this consistently would be to forbid all line endings
    except CR/LF; I think that's a bit draconian to handle the one instance in
    20 years where somebody's complained about @CLIP's GIGO.

    Rex Conn
    JP Software
  9. Steve Fabian

    Message Count:
    2,752
    | ---Quote---
    || Character sequence "CR CR LF" is treated by @CLIP as 2 EOLs. All
    || other operations which deal with lines (@execarray, @fileseekl,
    || @fileread, @line,
    || ffind) consider it as 1 EOL. IMHO it would be desirable for @CLIP
    || to be modified to behave identically to other operations.
    | ---End Quote---
    | That's a gibberish line ending -- TCC will recognize files with CR,
    | CR/LF, or LF line endings. Something like CR/CR/LF cannot be
    | interpreted. Different APIs are going to return different results,
    | depending on how much of the text they scan.
    |
    | You can either clean up your input or wait for the DWIM parser!

    In the specific instance that triggered my OP, it is the output of MS'
    ping.exe (WinXP SP3 version). I process my input with TCC, so it must do the
    cleaning up. It does it nicely by using @EXECARRAY[pingreport, ping %url]. I
    originally planned to use the clipboard, hence the report.

    Too bad that the test DEFINED ping[%n] (where ping is an array, and n is
    numeric) is always FALSE, whether or not the specific array element has been
    initialized to a value other than an empty string. Could this be changed in
    a future version?
    --
    Steve
  10. vefatica

    Message Count:
    3,983
    On Sat, 10 Jul 2010 13:17:45 -0400, rconn <>
    wrote:

    |---Quote---
    |> The user should expect @LINE[file,N], @line[clip:,N] and @CLIP[N] to
    |> agree when the clipboard and the file contain exactly the same data
    |> ... don't you think?
    |---End Quote---
    |No, I do not.
    |
    |What if you have CR/CR/LF/CR/CR/CR? Is the *real* line ending a CR/LF, or
    |is it CR with a random LF thrown in?

    Perhaps I'm old-fashioned, but CR (0x0D) means "move to the beginning
    of the current line"; it does not mean "go to the next line" and
    doesn't give a new line. A CR when you're already at the beginning of
    a line is merely redundant (and as such, poor programming though
    perhaps not entirely the programmer's fault). So I'd consider your
    extreme example above as simply CRLF. That's what you **see** in a
    console.
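
The "what you see in a console" argument can be made concrete with a toy terminal model. This is a Python sketch under simple assumptions (CR resets the column, LF moves down one row without changing the column, every other character overwrites the current cell); the function name `render` is hypothetical and no real console is this simple.

```python
def render(text):
    rows = {}          # row number -> list of characters on that row
    row = col = 0
    for ch in text:
        if ch == "\r":
            col = 0                    # carriage return: back to column 0
        elif ch == "\n":
            row += 1                   # line feed: next row, same column
        else:
            line = rows.setdefault(row, [])
            if len(line) <= col:
                # pad with spaces so that position col exists
                line.extend(" " * (col - len(line) + 1))
            line[col] = ch             # overwrite whatever was there
            col += 1
    if not rows:
        return []
    return ["".join(rows.get(r, [])).rstrip() for r in range(max(rows) + 1)]


print(render("foo\r\r\nbar\r\n"))
print(render("one\r\r\n\r\r\rtwo"))
```

In this model repeated CRs change nothing once the cursor is at column 0, so both inputs display as two consecutive lines with no blank line between them.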
  11. Steve Fabian

    Message Count:
    2,752
    | What if you have CR/CR/LF/CR/CR/CR? Is the *real* line ending a
    | CR/LF, or is it CR with a random LF thrown in?
    |
    | The only way to handle this consistently would be to forbid all line
    | endings except CR/LF; I think that's a bit draconian to handle the
    | one instance in 20 years where somebody's complained about @CLIP's
    | GIGO.

    IMHO the best way to handle it is not to consider CR as EOL. This would
    be consistent with its purpose in ASCII as a "format effector" moving the
    cursor to the beginning of the current line, allowing overprinting (as does
    BS). Only the LF, FF, and VT characters put you into a different line.
    Furthermore, technically none of those mean column change, hence the CR/LF
    sequence. There is also a now obsolete technical reason why the order is CR
    LF, not LF CR. AFAIK no system other than the Trash-80 (oops, I meant
    TRS-80) ever abused CR to mean EOL.
    --
    Steve
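
The rule proposed above (only LF, VT, and FF start a new line; CR never does) might be sketched as follows. This is Python rather than TCC, the function name `split_on_line_feeds` is hypothetical, and simply dropping CRs is a simplification that ignores overprinting.

```python
import re


def split_on_line_feeds(text):
    # Only LF (0x0A), VT (0x0B), and FF (0x0C) separate lines
    pieces = re.split("[\n\x0b\x0c]", text)
    # CR only affects the column, so it is discarded here
    return [piece.replace("\r", "") for piece in pieces]


# The CR/CR/LF/CR/CR/CR sequence discussed earlier, with text on either side
print(split_on_line_feeds("one\r\r\n\r\r\rtwo"))
# Text that uses CR alone as its line ending
print(split_on_line_feeds("one\rtwo\r"))
```

The trade-off is visible in the second call: text whose only line ending is CR comes back as a single line, so this rule cannot coexist with support for CR-only files.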
  12. rconn Administrator

    Message Count:
    5,853
    DEFINED refers to environment variables; array variables are not in the
    environment. I don't think it's a good idea to widen the scope of DEFINED,
    particularly when there's a dozen other existing ways to do it.

    Rex Conn
    JP Software
  13. Charles Dye Super Moderator

    Message Count:
    2,160
    (cough) PETSCII (cough)
  14. Steve Fabian

    Message Count:
    2,752
    | ---Quote---
    || Too bad that the test DEFINED ping[%n] (where ping is an array, and
    || n is numeric) is always FALSE, whether or not the specific array
    || element has been initialized to a value other than an empty string.
    || Could this be changed in a future version?
    | ---End Quote---
    | DEFINED refers to environment variables; array variables are not in
    | the environment. I don't think it's a good idea to widen the scope
    | of DEFINED, particularly when there's a dozen other existing ways to
    | do it.

    Looks like a duck, walks like a duck ... I did not refer to checking
    whether or not an array variable is defined; I was referring to a single
    element of the array, which can be set and its value used just like an
    ordinary environment variable. The only test I found to check whether or not
    a specific array element is defined is to check its length.
    I cannot see a reason not to expand what parameters are acceptable for
    the DEFINED status test to include array elements, and indeed even to
    internal variables.
    --
    Steve
  15. drrob106

    Message Count:
    36
    Macs use CR as EOL.



    ----- Reply message -----
    From: "Steve Fábián" <>
    Date: Sat, Jul 10, 2010 2:03 pm
    Subject: [Support-t-2150] @clip peculiarity
    To: <rob@drrob1.com>

    | What if you have CR/CR/LF/CR/CR/CR? Is the *real* line ending a
    | CR/LF, or is it CR with a random LF thrown in?
    |
    | The only way to handle this consistently would be to forbid all line
    | endings except CR/LF; I think that's a bit draconian to handle the
    | one instance in 20 years where somebody's complained about @CLIP's
    | GIGO.

    IMHO the best way to handle it is not to consider CR as EOL. This would
    be consistent with its purpose in ASCII as a "format effector" moving the
    cursor to the beginning of the current line, allowing overprinting (as does
    BS). Only the LF, FF, and VT characters put you into a different line.
    Furthermore, technically none of those mean column change, hence the CR/LF
    sequence. There is also a now obsolete technical reason why the order is CR
    LF, not LF CR. AFAIK no system other than the Trash-80 (oops, I meant
    TRS-80) ever abused CR to mean EOL.
    --
    Steve
  16. vefatica

    Message Count:
    3,983
    On Sat, 10 Jul 2010 14:45:41 -0400, Steve Fábián
    <> wrote:

    | I cannot see a reason not to expand what parameters are acceptable for
    |the DEFINED status test to include array elements, and indeed even to
    |internal variables.

    I don't think you'd get what you want. Once you say "SETARRAY n[5]",
    the individual elements are "defined". If you want to see if an
    element has any meaningful value, use "%n[%i]" NE "".
  17. rconn Administrator

    Message Count:
    5,853
    Macs use CR as EOL.

    Rex Conn
    JP Software
  18. rconn Administrator

    Message Count:
    5,853
    Adding internal variables seems faintly ridiculous -- why not just test "if
    1==1"? In what case would an internal variable *not* be defined?

    I am not going to change DEFINED at this point; use one of the many existing
    alternatives. And aren't you the same guy who gets crazed when anything is
    changed affecting backwards compatibility? :)

    Rex Conn
    JP Software
  19. Steve Fabian

    Message Count:
    2,752
    | Once you say "SETARRAY n[5]", the individual elements are "defined".

    Compare with environment variables. All possible variable names are
    always declared IMPLICITLY, and their values are accessible without ever
    initializing them. If not previously initialized, the value as a string is
    the empty string, and as a numeric value it is zero. For example, the
    following is perfectly operable code:

    UNSET Z
    SET /A Z+=1

    In the same manner your command above makes %n[0] ... %n[4] accessible,
    and %n[5] still inaccessible. You need to use @execarray, @filearray, or SET
    to actually initialize the individual array elements just as if they were
    independent environment variables.

    When you initialize an array's elements using the @filearray function,
    its value tells you how many elements you actually initialized.
    Unfortunately @execarray does not provide that information, nor does it load
    the uninitialized elements with the equivalent of **EOC** as @CLIP[] reports
    or with **EOF** as @fileread[] does. In fact you have to know the output
    format of the command A PRIORI to locate the end of data. This is why I
    requested a change in @execarray.

    When the array is initialized using @filearray or @execarray, some
    elements may be initialized to empty strings, corresponding to blank lines.
    When you process a file using "for %x in (@file) ..." or its DO equivalent,
    the test "if defined x" is the simple test for a blank line. I was looking
    for a similarly simple test when the file (or command output) is put into an
    array. Your test "%n[%i]" NE "" is logically equivalent to what I used to do
    for environment variables before DEFINED was available: %@len[%n[%i]] GT 0.
    I am sure both are slower than DEFINED n[%i] would be, though I am not sure
    whether your test or mine is faster. Either way, DEFINED is definitely
    simpler to understand.
    --
    Steve
  20. rconn Administrator

    Message Count:
    5,853
    Two corrections -- first, you can get the size info with @ARRAYINFO, and
    second, there *are* no uninitialized elements. @EXECARRAY allocates only as
    much as it needs, and initializes everything.

    Rex Conn
    JP Software
