So today I did the unthinkable: I went to the library and scanned a paper. Here is how to clean up the scan into a tidier, smaller PDF. All commands are for Linux (Debian):

  1. Convert the pdf into pgm files, one per page:
    gs -dBATCH -dNOPAUSE -sDEVICE=pgmraw -r300 -sOutputFile="conv_%03d.pgm" original.pdf
  2. Use unpaper to remove black margins and rotate the pages:
    unpaper conv_%03d.pgm unpaper_%03d.pgm
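  3. Put the cleaned pgm files back into a pdf file (match only the unpaper_ files, since *.pgm would also pick up the uncleaned conv_ pages):
    convert unpaper_*.pgm new.pdf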
  4. Compress the resulting pdf file:
    ps2pdf new.pdf newer.pdf
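
The four steps above can be chained into one small shell function. This is only a sketch: the names clean_scan, run and DRY_RUN are made up for illustration and are not part of any of the tools. It assumes gs, unpaper, convert (ImageMagick) and ps2pdf are installed, and it leaves the intermediate conv_*.pgm, unpaper_*.pgm and new.pdf files in the current directory.

```shell
#!/bin/sh
# Sketch only: clean_scan, run and DRY_RUN are hypothetical helper names.

# Print a command, then execute it unless DRY_RUN=1 is set.
run() {
    printf '%s\n' "+ $*"
    if [ "${DRY_RUN:-0}" != "1" ]; then
        "$@"
    fi
}

# Usage: clean_scan INPUT.pdf OUTPUT.pdf
clean_scan() {
    in=$1
    out=$2
    # 1. Render every page to its own 300 dpi pgm file.
    run gs -dBATCH -dNOPAUSE -sDEVICE=pgmraw -r300 -sOutputFile=conv_%03d.pgm "$in" || return 1
    # 2. Let unpaper remove the black margins and straighten the pages.
    run unpaper conv_%03d.pgm unpaper_%03d.pgm || return 1
    # 3. Assemble only the cleaned pages into a pdf.
    run convert unpaper_*.pgm new.pdf || return 1
    # 4. Compress the result.
    run ps2pdf new.pdf "$out" || return 1
}
```

After pasting the function into a shell (or sourcing the file), "clean_scan original.pdf newer.pdf" runs the whole pipeline, and "DRY_RUN=1 clean_scan original.pdf newer.pdf" only prints the commands so they can be checked first.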
