Determining compression levels for scanning

In-house testing was conducted to determine which compression settings produce the best scanning results. The results cover readability and OCR suitability across different color modes, resolutions, and compression levels.

Scanner and software used for the testing:

  • Scanner - Brother DCP-7065DN LAN
  • Scanner software - Brother CC4 (LEEDTOOLS)
  • Compression software - TIFFCP (VeryPDF.com)

Notes about the testing:

  • G3/G4 compression only works with binary files (one bit per pixel).
  • JPEG compression only works with color files; grayscale files need minor adjustments.
  • The scan types are a function of the supplied scanner software. Other software may differ.
  • The complete sample set is available, but only the most relevant samples are shown.
  • Gray diffusion and black and white are both binary files.
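
The compatibility notes above can be summarized as a small lookup. This is only an illustrative sketch: the scan type names, the bit depths assumed for grayscale and full color (8 and 24 bits per pixel), and the function name are assumptions made for the example, not settings taken from the scanner software.

```python
# Hypothetical scan type names mapped to bits per pixel.
# The 8-bit grayscale and 24-bit color depths are assumed typical values.
BITS_PER_PIXEL = {
    "black_and_white": 1,
    "gray_diffusion": 1,   # dithered, but still a binary file
    "grayscale": 8,
    "full_color": 24,
}


def compatible_compressions(scan_type):
    """Return the compression schemes from the notes that fit a scan type."""
    bits = BITS_PER_PIXEL[scan_type]
    if bits == 1:
        # G3/G4 only work with one-bit-per-pixel files.
        return ["G3", "G4"]
    # JPEG works with color files; grayscale needs minor adjustments.
    return ["JPEG"]


if __name__ == "__main__":
    for name in BITS_PER_PIXEL:
        print(name, compatible_compressions(name))
```

Because gray diffusion output is binary, it falls in the same G3/G4 group as plain black and white.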

Conclusions:

  • There is a significant difference in size between compressed image scans and compressed text scans.
  • Black and white scans should be no less than 300x300 dpi for good readability. Black and white scans are good enough for text documents. Even at 300 dpi the file size is very small, and the scan is suitable for OCR. G4 compression has virtually no visible effect on black and white images.
  • Full color scans are still readable at 150x150 dpi, though they may not be suitable for OCR. Depending on the final use, and on whether there are photos that need to be captured, color documents may need to be scanned in full color, though black and white will still produce easily readable results for text. Full color scans should be compressed with JPEG 50% or lower. JPEG 25% shows noticeable compression artifacts but is still very readable.
  • If possible, choose the resolution and compression scheme best suited to each page of a multi-page document to keep the total size to a minimum.
  • These findings generally apply to both TIFF and PDF files.
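
As a rough sketch of how the conclusions above might be applied page by page, the following picks settings per page and estimates the uncompressed size being compressed. The class, function, and parameter names are hypothetical, the 24 bits per pixel for full color is an assumed typical value, and 150 dpi is used for color only because it is the lowest resolution reported as still readable; pages that need OCR may need more. The size estimate ignores file headers and row padding.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScanSettings:
    mode: str
    dpi: int
    bits_per_pixel: int
    compression: str


def choose_settings(has_photos, needs_color=False):
    """Pick per-page settings following the conclusions (illustrative only)."""
    if has_photos or needs_color:
        # Color pages: readable at 150 dpi, JPEG at 50% or lower.
        return ScanSettings("full_color", 150, 24, "JPEG 50%")
    # Text pages: black and white at no less than 300 dpi, G4 compression.
    return ScanSettings("black_and_white", 300, 1, "G4")


def uncompressed_bytes(width_in, height_in, settings):
    """Approximate raw image size before compression."""
    width_px = round(width_in * settings.dpi)
    height_px = round(height_in * settings.dpi)
    return width_px * height_px * settings.bits_per_pixel // 8


if __name__ == "__main__":
    # A hypothetical three-page letter-size document: text, photo, text.
    pages = [False, True, False]
    for number, has_photos in enumerate(pages, start=1):
        settings = choose_settings(has_photos)
        size = uncompressed_bytes(8.5, 11, settings)
        print(number, settings.mode, settings.dpi, settings.compression, size)
```

Even before compression, a letter-size black and white page at 300 dpi is about one sixth the raw size of a full color page at 150 dpi under these assumptions, which is consistent with treating each page separately.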

Document scanning test results

Please see the attached file for the scanning compression results.