[OS X TeX] OT: Tool for Comparing PDF files ?
Steffen Wolfrum
osxtex_2 at st.estfiles.de
Thu Apr 5 10:49:20 EDT 2007
Hi,
I have to admit that I am still not familiar with command-line/UNIX stuff.
That's the reason why I haven't tried Michael's script until now.
But Axel's enthusiastic comment made me curious, so I am considering diving into it.
I just don't know where to start: Which package (pnmarith/pamarith???) do I need now, and where do I get it?
Steffen
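
One way to approach the "which package" question is to first check which of the commands the quoted script calls are already installed. A minimal sketch, assuming a POSIX-style shell; the function name missing_tools is made up for illustration, and the tool list is simply read off the script quoted below (Ghostscript's gs plus several Netpbm programs):

```shell
#!/bin/bash
# Print the name of every command in the argument list that is not on PATH.
# (missing_tools is a hypothetical helper, not part of the quoted script.)
missing_tools () {
    local tool
    for tool in "$@"; do
        command -v "$tool" > /dev/null 2>&1 || echo "$tool"
    done
}

# Commands the quoted script invokes besides the arithmetic tool.
missing_tools gs pgmhist rgb3toppm pnmtopng

# pamarith and pnmarith are alternatives; only one of them needs to exist.
if [ -n "$(missing_tools pamarith)" ] && [ -n "$(missing_tools pnmarith)" ]; then
    echo "neither pamarith nor pnmarith found"
fi
```

If this prints nothing, everything the script needs appears to be installed; any name it prints is a command that still has to come from a Netpbm or Ghostscript package.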
On Thu, 8 Mar 2007 04:17:07 -0600, Axel E. Retif wrote:
> On Mar 2, 2007, at 09:41, Michael Sternberg wrote:
>
>> Hello,
>>
>> On Mar 1, 2007, at 9:34 , Steffen Wolfrum wrote:
>>> Does someone know a tool for comparing PDF documents?
>>>
>>> Now and then I make small changes in the source files and would
>>> feel safer if I had a tool that would show me whether a resulting
>>> PDF differs or not (from a PDF that was made before I made
>>> the changes) …
>>>
>>
>> Try the script tacked-on below. It does a graphical diff:
>>
>> diffps -h
>> diffps fileA.pdf fileB.pdf
>>
>> You need the netpbm[plus] package and Ghostscript.
>
> Your shell script is wonderful! Thank you. I tried it with two
> identical PDFs, and it reported nothing; then I changed (with
> Acrobat) just 1 letter in one of the two 224-page long PDFs, and it
> found the difference.
>
> Just one thing, though ---it calls pnmarith instead of the new
> pamarith. According to
>
> http://netpbm.sourceforge.net/doc/pnmarith.html
>
> pnmarith is obsolete. (And Gerben Wierda's Netpbm i-package comes
> with the new pamarith, not pnmarith.)
>
> Thank you again,
>
> Axel
>
>> By default, it uses "xv" ("Preview" on MacOS) to display differing
>> pages. Use "-x foo" to specify another viewer, which must read ppm
>> and png files.
>>
>> For repeated uses, it uses a page-cache, which you can override with
>> -f and clean with -c.
>>
>>
>> Regards, Michael
>> ------------------------------------------------------------
>> #!/bin/bash
>> # compare pages in two similar ps-files by highlighting their differences
>> # (uses grayscale pixmaps for comparing)
>> #
>> # Usage: (see -h)
>> #
>> # Created by Michael Sternberg, 2001-2007. Use at your own risk.
>>
>> PROGRAM=`basename $0`
>>
>> CACHE=.diffps
>> PAGES="*"
>> RES=72
>> VIEWER="xv -nolimit -24"
>> PAIR_FILE=pairs
>> VIEWS=1
>> HIST_THRESHOLD=1
>>
>> case `uname` in
>> Darwin) VIEWER="open -a Preview" ;;
>> esac
>>
>> Usage () {
>> cat << EOT
>> Compare postscript/pdf files visually.
>> Usage: $PROGRAM [options] file1 [file2 | dir]
>>
>> If file2 is not given, the latest version from CVS is used.
>>
>> Options:
>> Page rendering:
>> -d directory
>> directory for page cache (default: "$CACHE")
>>
>> -p pages
>> view only the given pages (quoted shell glob pattern)
>> (default: "$PAGES")
>>
>> -t threshold
>> minimum number of pixels to differ (default: $HIST_THRESHOLD)
>>
>> -r res Resolution for pixmap rendering (default: $RES)
>>
>> -f re-do comparison (force; discard cache)
>>
>> Viewing:
>> -0 report only
>> -1 view differing pages in diff-mode (red = recent; default)
>> -2 view differing pages pairwise
>> -3 both of the above
>> -x viewer specify image viewer for above (default: xv)
>>
>> General:
>> -h This help.
>> -c clean cache
>>
>> Created by Michael Sternberg, 2001-2007. Use at your own risk.
>> EOT
>> exit
>> }
>>
>> Clean_Cache () {
>> case $CACHE in
>> */*) echo $CACHE: not a subdirectory -- please clean manually. 1>&2
>> exit ;;
>> esac
>> rm -rf $CACHE # better know what you're doing
>> }
>>
>>
>> # parse options
>> while :
>> do
>> case "$1" in
>> -d) CACHE=$2; shift 2 ;;
>> -p) PAGES=$2; shift 2 ;;
>> -r) RES=$2; shift 2 ;;
>> -f) FORCE=1; shift ;;
>> -t) HIST_THRESHOLD=$2; shift 2 ;;
>>
>> -0) VIEWS=0; shift ;;
>> -1) VIEWS=1; shift ;;
>> -2) VIEWS=2; shift ;;
>> -3) VIEWS=3; shift ;;
>> -x) VIEWER=$2; shift 2 ;;
>>
>> -c) CLEAN=1; shift ;;
>> -h) Usage ;;
>>
>> -*) echo $0: unknown option 1>&2
>> Usage
>> exit 1 ;;
>> *) break ;;
>> esac
>> done
>>
>> # clean cache. Exit if this is the only task.
>> if [ -n "$CLEAN" ]; then
>> Clean_Cache
>> case $# in
>> 0) exit ;;
>> esac
>> fi
>>
>> # attempt to create cache dir
>> mkdir $CACHE 2> /dev/null
>>
>> A_PS="$1"
>> B_PS="${2-$CACHE}"
>> [ -d "$B_PS" ] && B_PS="$B_PS/$A_PS"
>>
>> case $# in
>> 2) ;;
>> 1) # get older copy from CVS
>> cvs up -p "$A_PS" > "$B_PS" || exit
>> # swap A and B to have named file as B, i.e., newer copy
>> X="$B_PS"; B_PS="$A_PS"; A_PS="$X"
>> ;;
>> *) echo Invalid input. 1>&2
>> Usage
>> exit 1
>> ;;
>> esac
>>
>> A_BASE="${A_PS//\//_}"
>> B_BASE="${B_PS//\//_}"
>>
>> # convert to pixmap format; use cache when available and not outdated
>> if [ ! -f $CACHE/"$A_BASE"-001.pgm \
>> -o "$A_PS" -nt $CACHE/"$A_BASE"-001.pgm \
>> -o -n "$FORCE" \
>> ]
>> then
>> gs -dNOPAUSE -sDEVICE=pgmraw -r$RES \
>> -sOutputFile=$CACHE/"$A_BASE"-%03d.pgm \
>> "$A_PS" quit.ps || exit
>> fi
>>
>> if [ ! -f $CACHE/"$B_BASE"-001.pgm \
>> -o "$B_PS" -nt $CACHE/"$B_BASE"-001.pgm \
>> -o -n "$FORCE" \
>> ]
>> then
>> gs -dNOPAUSE -sDEVICE=pgmraw -r$RES \
>> -sOutputFile=$CACHE/"$B_BASE"-%03d.pgm \
>> "$B_PS" quit.ps || exit
>> fi
>>
>> # compare pages
>> OWD=`pwd`
>> cd $CACHE
>> rm -f $PAIR_FILE 2> /dev/null
>> for A_PGM in "$A_BASE"-${PAGES}.pgm
>> do
>> SUFFIX="${A_PGM//*-/}"
>> N=${SUFFIX/.pgm/}
>>
>> B_PGM="$B_BASE-${SUFFIX}"
>>
>> H_DAT="$A_BASE-$B_BASE-${N}-hist.dat"
>> V="$A_BASE-$B_BASE-${N}-view.png"
>> D="$A_BASE-$B_BASE-${N}-diff.png"
>>
>> if [ ! -f "$H_DAT" -o -n "$FORCE" ]; then
>> # get histogram of diffs
>> pnmarith -diff "$A_PGM" "$B_PGM" | tee "$D".pgm | pgmhist > "$H_DAT"
>> fi
>>
>> ## Sample histogram:
>> # value count b% w%
>> # ----- ----- -- --
>> # 0 484690 100% 100%
>> # 255 14 100% 0.00289%
>>
>> # count non-black pixels
>> H_COUNT=`awk 'NR>3 { sum += $2} END {print 1*sum}' "$H_DAT"`
>>
>> # assemble views of differing pages (only)
>> if [ $H_COUNT -ge $HIST_THRESHOLD ]; then
>> echo $N differ 1>&2
>> if [ ! -f "$V" -o -n "$FORCE" ]; then
>> rgb3toppm "$A_PGM" "$B_PGM" "$B_PGM" \
>> | pnmtopng -transparent white -background grey50 > "$V"
>> pnmtopng "$D".pgm > "$D"
>> fi
>> echo "$V" "$A_PGM" "$B_PGM" >> $PAIR_FILE
>> fi
>> rm -f "$D".pgm 2> /dev/null
>>
>> ## When memory is tight -- This renders options "-2" and "-3" useless.
>> #if [ -z "$VIEWS" ]; then
>> # rm "$A_PGM" "$B_PGM"
>> #fi
>> done
>>
>> # decide which images to view
>> case $VIEWS in
>> 1) COLS=1 ;; # diff-view only
>> 2) COLS=2-3 ;; # page pairs only
>> 3) COLS=1-3 ;; # all
>> *) exit ;;
>> esac
>>
>> # see if xargs supports the flag -r --no-run-if-empty
>> xargs -r < /dev/null 2> /dev/null && XARGS_ARGS="-r"
>>
>> if [ -f $PAIR_FILE ]; then
>> cut -f$COLS -d' ' $PAIR_FILE | xargs $XARGS_ARGS $VIEWER
>> fi
>>
>> # EOF
>>
>>
>> ------------------------- Helpful Info -------------------------
>> Mac-TeX Website: http://www.esm.psu.edu/mac-tex/
>> TeX FAQ: http://www.tex.ac.uk/faq
>> List Archive: http://tug.org/pipermail/macostex-archives/
>> List Reminders & Etiquette: http://www.esm.psu.edu/mac-tex/list/
>>
>>
>>
>
>
>
>
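
Regarding Axel's remark that pnmarith is obsolete: a hedged sketch of how the differencing step could pick whichever of the two programs is installed, preferring pamarith. The function name pick_arith and the variable ARITH are made up for illustration and are not part of Michael's script; the sketch assumes both programs accept the fully spelled-out -difference option.

```shell
#!/bin/bash
# Print the first command from the argument list that is found on PATH.
# Returns 1 and prints nothing if none of them is installed.
# (pick_arith is a hypothetical helper, not part of the quoted script.)
pick_arith () {
    local tool
    for tool in "$@"; do
        if command -v "$tool" > /dev/null 2>&1; then
            echo "$tool"
            return 0
        fi
    done
    return 1
}

# Prefer the newer pamarith; fall back to pnmarith on older Netpbm installs.
ARITH=$(pick_arith pamarith pnmarith) \
    || echo "no Netpbm arithmetic tool found" 1>&2

# The script's differencing line could then read (hypothetical edit):
#   $ARITH -difference "$A_PGM" "$B_PGM" | tee "$D".pgm | pgmhist > "$H_DAT"
```

With this in place the same script should run unchanged on systems that ship only pamarith and on those that still have pnmarith.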