non-ASCII files

Discussion in 'T&T - Scripting' started by epement, May 8, 2012.

  1. epement

    Jun 28, 2008
    I don't know if this batch file interests anyone else, but I've found it pretty useful. Note: it uses sed and "od" (octal dump), so it's not a "pure" Take Command batch file. Here is the embedded help:

    NONASC.BTM v1.43 -
      Look for non-printable chars in file or input. Valid chars include 20h-7Eh,
      TAB, CR, and LF. Control codes or graphic chars are considered "non-ASCII".
      The input file is not changed. Exit code 0 if pure ASCII, or 1 otherwise.
    Usage: nonasc [-switches] [filename]
      Only 1 filename allowed. If filename is omitted, read from stdin. If no
      switches are used, look for non-printing characters and display the first
      2 lines of non-printing chars, if any. Hits are shown in reverse video.
    Switches (may appear in any order):
      -d    Dump the input file in both hex and ASCII to the screen without
            checking for invalid characters. Works on pure binary files.
            Long files should be piped through a file pager (e.g., "less").
            If the -{n} switch is used, limit the output to {n} lines.
      -{n}  Display {n} lines of input, where {n} is an integer (default: 2)
            If -d is omitted, display {n} lines of invalid input.
            If -d is used, display {n} lines beginning at top of file.
            Numeric switch must be separate from other options! Thus,
            "-d -5" and "-7 -t" are valid, but "-d5" or "-7t" are not.
      -t    Include TAB (0x09) as invalid char.  -t and -d are incompatible.
            If both are present, -t will be ignored.
      -r    Include CR (0x0D) as invalid char.
      -z    Allow Ctrl-Z (0x1A), but only at EOF.
      -?, --help      Display this help message
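
    For anyone who wants to see the rule itself rather than the batch file, here is a minimal sketch in Python of the check the help text describes. It is not the NONASC.BTM source (which relies on sed and od), and the function and parameter names are made up for illustration. It assumes the valid set is 20h-7Eh plus TAB, CR, and LF, that the three flags mirror -t, -r, and -z, and that Ctrl-Z is the byte 0x1A.

```python
def invalid_offsets(data: bytes, tab_invalid: bool = False,
                    cr_invalid: bool = False, allow_ctrl_z: bool = False):
    """Return the offsets of bytes that fall outside the allowed set.

    Hypothetical helper: the flags mirror the -t, -r and -z switches.
    """
    allowed = set(range(0x20, 0x7F))   # printable ASCII, 20h-7Eh
    allowed.add(0x0A)                  # LF is always valid
    if not tab_invalid:                # -t turns TAB into an invalid char
        allowed.add(0x09)
    if not cr_invalid:                 # -r turns CR into an invalid char
        allowed.add(0x0D)

    end = len(data)
    if allow_ctrl_z and data.endswith(b"\x1a"):
        end -= 1                       # -z: tolerate one Ctrl-Z, at EOF only

    return [i for i, b in enumerate(data[:end]) if b not in allowed]


def exit_code(data: bytes, **opts) -> int:
    """0 if the input is pure ASCII under the chosen options, 1 otherwise."""
    return 1 if invalid_offsets(data, **opts) else 0
```

    For example, invalid_offsets(b"MZ\xfe\x00") flags offsets 2 and 3, which is why an .EXE-style header trips the check right away.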
    Oh, I almost forgot. Sample input and output:
    [0]> nonasc nonasc.btm
    "nonasc.btm" contains only printable ASCII chars.
    [0]> nonasc \windows\system32\edit.com
    "\windows\system32\edit.com" contains control codes or graphics characters!
    od -An -tx1 -a "\windows\system32\edit.com": (octdump -no_offset -hexcodes -ascii)
      4d  5a  fe  00  89  00  00  00  cd  01  fa  04  ff  ff  58  0f
       M   Z   ~ nul  ht nul nul nul   M soh   z eot del del   X  si
      80  00  00  00  10  00  2c  0f  1e  00  00  00  01  00  00  00
     nul nul nul nul dle nul   ,  si  rs nul nul nul soh nul nul nul
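
    If the two-row layout above looks odd: the second row is od's "-a" format, which names each character and ignores the high bit, so FEh shows up as "~" and 89h as "ht". Below is a rough Python sketch that mimics that layout for a byte string. It is only an illustration of the output format, not how the batch file produces it (the batch file simply calls od), and the names char_name and dump_rows are hypothetical.

```python
# Names used by "od -a" for 00h-20h; 7Fh is "del".
NAMES = ("nul soh stx etx eot enq ack bel bs ht nl vt ff cr so si "
         "dle dc1 dc2 dc3 dc4 nak syn etb can em sub esc fs gs rs us sp").split()


def char_name(b: int) -> str:
    """Name a byte the way "od -a" does, ignoring the high bit."""
    b &= 0x7F
    if b <= 0x20:
        return NAMES[b]
    if b == 0x7F:
        return "del"
    return chr(b)


def dump_rows(data: bytes, width: int = 16):
    """Yield (hex_row, name_row) pairs, each field right-aligned to 4 columns."""
    for start in range(0, len(data), width):
        chunk = data[start:start + width]
        hex_row = "".join(f"{b:02x}".rjust(4) for b in chunk)
        name_row = "".join(char_name(b).rjust(4) for b in chunk)
        yield hex_row, name_row


if __name__ == "__main__":
    # First 16 bytes from the sample output above.
    sample = bytes.fromhex("4d5afe0089000000cd01fa04ffff580f")
    for hex_row, name_row in dump_rows(sample):
        print(hex_row)
        print(name_row)
```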
    If there is any interest in this, I'll post the source.

    Eric P.

