TPipe /dup

Discussion in 'Support' started by Charles G, Nov 25, 2015.

  1. Charles G

    Joined:
    Apr 2, 2011
    Messages:
    1,043
    Likes Received:
    0
    /dup=Type,MatchCase,StartColumn,Length,IncludeOne,Format

    Remove or show duplicate lines. The arguments are:

    Type:
    0 - Remove duplicate lines
    1 - Show duplicate lines

    MatchCase - If 1, do case-sensitive comparisons

    StartColumn - The starting column for comparisons

    Length - The Length of the comparison

    IncludeOne - Include lines with a count of 1

    Format - how the output should be formatted for Type=1. For example, "%d %s" to show the count followed by the string.

    =========================

    1) Is StartColumn 0-based or 1-based?

    2) What is IncludeOne used for?

    I am using "Type=0" so Format is not necessary.

    ========================
    TCC 19.00.15 x64 Windows 7 [Version 6.1.7601]
     
  2. Bob Chapman

    Joined:
    May 31, 2008
    Messages:
    55
    Likes Received:
    0
    tpipe gives me the creeps :rolleyes: but, FWIW, from the online TextPipe manual I'd infer that:
    1. tpipe columns start at 1
    2. If IncludeOne is 1, tpipe will also produce output for lines that have no duplicates [remembering that the output for any line is determined solely by the Format you cleverly choose] :smile:
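
The first inference can be illustrated with a small Python sketch. This is not TPipe's implementation; `compare_key` is a hypothetical helper, and it assumes StartColumn is 1-based and that MatchCase=0 means the comparison ignores case.

```python
def compare_key(line, match_case=0, start_column=1, length=999):
    """Return the part of `line` that would be compared (hypothetical helper)."""
    # Assumption: StartColumn is 1-based, so column 1 maps to index 0.
    begin = max(start_column - 1, 0)
    key = line[begin:begin + length]
    # Assumption: MatchCase=0 means case is ignored during comparison.
    return key if match_case else key.lower()


print(compare_key("Joe@Foo.com"))                            # joe@foo.com
print(compare_key("Joe@Foo.com", match_case=1))              # Joe@Foo.com
print(compare_key("Joe@Foo.com", start_column=5, length=3))  # foo
```

Under these assumptions, two lines count as duplicates when their keys are equal, so a StartColumn of 1 with a large Length compares whole lines.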
     
  3. Charles G

    @Bob Chapman -

    I want to remove duplicate lines from a file that contains email addresses, one per line.

    So:
    - for lines that occur only once, output that line.
    - for lines that occur more than once, output the line only once.
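
The two rules above amount to "keep the first occurrence of every line". A minimal Python sketch of that behavior follows, only to pin down the expected result; `unique_lines` is a hypothetical name, and treating addresses as case-insensitive by default is an assumption.

```python
def unique_lines(lines, match_case=False):
    """Keep the first occurrence of each line, preserving input order."""
    seen = set()
    result = []
    for line in lines:
        # Assumption: addresses differing only in case are duplicates.
        key = line if match_case else line.lower()
        if key not in seen:
            seen.add(key)
            result.append(line)
    return result


addresses = ["a@x.com", "b@x.com", "A@x.com", "c@x.com", "b@x.com"]
print(unique_lines(addresses))  # ['a@x.com', 'b@x.com', 'c@x.com']
```

Because this version tracks the keys it has already seen, it does not need sorted input; a sort-then-compare-neighbors approach would give the same set of lines in sorted order.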

    The following is the BTM so far (not long ...!):

    Code:
    goto :here
    
    :here
      setlocal
        set fldr=c:\Users\Galloway\Desktop\EMailAddrs\
        rem next set is output file name for SORT /Output...
        set cSrtOut=SortOut.lst
        rem next is work file for email addresses
        set cOut=EmailOut.lst
        if not isdir "%fldr" md /s "%fldr"
        global /h /i /n /q GoSub DoFldr
      endlocal
      quit
    
    :DoFldr
      echo In: %_CWD
      rem before processing current folder
      set nOutSize=%@filesize[%fldr%%cOut]
      rem extract email addresses
      for %fn in (*.eml *.lst *.txt) if "%fn" NE "%fldr%%cSrtOut" .and. "%fn" NE "%fldr%%cOut" tpipe /input="%fn",0,1 /simple=28 /outputappend=1 /output=%fldr%%cOut
      rem if file has changed = means more email addresses found
      iff %nOutSize != %@filesize[%fldr%%cOut] then
        rem make sure file exists
        iff isfile "%fldr%%cOut" then
          rem sort email addresses
          sort /rec 65535 "%fldr%%cOut" /output=%fldr%%cSrtOut
          rem remove duplicate email addresses
          tpipe /input=%fldr%%cSrtOut /dup=Type,MatchCase,StartColumn,Length,IncludeOne,Format /output=%fldr%%cOut
          rem                                                 0,            0,               0, 65535, ??????????, ?????
        endiff
      endiff
      return
    
     
  4. vefatica

    Joined:
    May 20, 2008
    Messages:
    8,076
    Likes Received:
    30
    Code:
    v:\> type dups.txt
    joe@foo.com
    bob@foo.com
    joe@foo.com
    bob@bar.com
    bob@foo.com
    bob@bar.com
    tom@xyz.edu
    tom@xyz.com
    tom@xyz.edu
    moe@foo.com
    
    v:\> tpipe /input=dups.txt /dup=0,0,1,999,1,""
    joe@foo.com
    bob@foo.com
    bob@bar.com
    tom@xyz.edu
    tom@xyz.com
    moe@foo.com
    
    v:\> tpipe /input=dups.txt /dup=0,0,1,999,0,""
    joe@foo.com
    bob@foo.com
    bob@bar.com
    tom@xyz.edu
    tom@xyz.com
    moe@foo.com
    
    I can't get "Type=1" to output anything.

    (Edit) The format string needs %% (percent signs doubled), because TCC treats a single % as the start of a variable reference.
     
    #4 vefatica, Nov 27, 2015
    Last edited: Nov 27, 2015
  5. vefatica

    I think "Include lines with a count of 1" is for Type=1.
    Code:
    v:\> type dups.txt
    joe@foo.com
    bob@foo.com
    joe@foo.com
    bob@bar.com
    bob@foo.com
    bob@bar.com
    tom@xyz.edu
    tom@xyz.com
    tom@xyz.edu
    moe@foo.com
    
    v:\> tpipe /input=dups.txt /dup=1,0,1,999,1,"%%d %%s"
    2 bob@bar.com
    2 bob@foo.com
    2 joe@foo.com
    1 moe@foo.com
    1 tom@xyz.com
    2 tom@xyz.edu
    
    v:\> tpipe /input=dups.txt /dup=1,0,1,999,0,"%%d %%s"
    2 bob@bar.com
    2 bob@foo.com
    2 joe@foo.com
    2 tom@xyz.edu
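
The two Type=1 runs above can be mimicked with a short Python sketch, to make the apparent effect of IncludeOne explicit. This is not how TPipe works internally; `duplicate_report` is a hypothetical function, the sorted output order is copied from the runs above, and case handling is left out because the sample data is all lowercase. Python needs no doubled percent signs, so the format is written as "%d %s".

```python
from collections import Counter


def duplicate_report(lines, include_one, fmt="%d %s"):
    """Format each distinct line with its count, sorted by line text."""
    counts = Counter(lines)
    report = []
    for line in sorted(counts):
        count = counts[line]
        # Assumption: IncludeOne=0 drops lines that occur exactly once.
        if count > 1 or include_one:
            report.append(fmt % (count, line))
    return report


lines = [
    "joe@foo.com", "bob@foo.com", "joe@foo.com", "bob@bar.com", "bob@foo.com",
    "bob@bar.com", "tom@xyz.edu", "tom@xyz.com", "tom@xyz.edu", "moe@foo.com",
]
print("\n".join(duplicate_report(lines, include_one=1)))
print("--")
print("\n".join(duplicate_report(lines, include_one=0)))
```

With include_one=1 the sketch lists all six distinct addresses with their counts; with include_one=0 it lists only the four that occur twice, matching the two outputs shown above.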
     
  6. Charles G

    It seems the help could use an example of specifying /input or /output when they are given as variables, for example:

    tpipe /input="%fn" /simple=28 /outputappend=1 /output=%fldr%%cOut
     
