Tuesday, February 9, 2010

OCR and Scanners

When converting scanned documents to searchable text, there are many requirements to meet.  The most important is to ensure you start with the best quality scan you can get.  OCR software can only give you good results if it is handed a good-quality document.  Below are some tips to help you get the best results:

  • Set your scanner to 300 DPI - this is the sweet spot for OCR, and most engines are tuned for this resolution.  Upping the DPI will give you nothing but slower processing and larger file sizes ;)
  • If you have image processing, use it!  Deskew, despeckle, etc. will help the overall quality of the image and give you better results.  Advanced document capture applications will give you a full suite of image processing tools to ensure the highest possible quality.
  • Black and white - many engines now support color, but you will get the best quality from a black and white image.
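
To put rough numbers on the resolution and color tips above, here is a small Python sketch that estimates the uncompressed size of a scanned page. The function name `raw_scan_bytes`, the US Letter page size, and the row-padding rule are assumptions made for illustration; they are not taken from any particular scanner or OCR product, and real files (compressed TIFF, JPEG, PDF) will usually be much smaller than these raw figures.

```python
def raw_scan_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Estimate the uncompressed size, in bytes, of one scanned page.

    Hypothetical helper for illustration only. Assumes each pixel row is
    padded up to a whole number of bytes.
    """
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    # Integer ceiling division: bits in a row, rounded up to whole bytes.
    bytes_per_row = (width_px * bits_per_pixel + 7) // 8
    return bytes_per_row * height_px


# Compare a US Letter page (8.5 x 11 inches, an assumed page size)
# at two resolutions, in 1-bit black and white and 24-bit color.
for dpi in (300, 600):
    for label, bits in (("black and white", 1), ("color", 24)):
        size_mb = raw_scan_bytes(8.5, 11, dpi, bits) / 1_000_000
        print(f"{dpi} DPI, {label}: {size_mb:.1f} MB")
```

Under these assumptions, a 300 DPI black and white page comes to roughly 1 MB of raw pixel data, while the same page in 24-bit color is roughly 25 MB. Doubling the resolution to 600 DPI multiplies the pixel count by four, which is where the slower processing and larger files come from.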
These are just a few tips; more to come.