Scanning and Indexing with Optical Character Recognition

Optical character recognition (OCR) converts images of typewritten or clearly printed text into machine-encoded text that is editable and searchable. This allows image-based documents to be quickly indexed and conveniently searched.

When paper documents are scanned into PDF or TIFF format, the server-side OCR feature in FileHold can automatically OCR them to produce a PDF document with a text layer that can be indexed for full-text search. Once the documents have been OCRed, the document management system can automatically index them and make their full text searchable. Full-text indexing lets users search for a word or phrase inside the body of a document once it is stored in the library.
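To illustrate why a text layer makes documents searchable, here is a minimal sketch of full-text indexing over OCR output. It is not FileHold's implementation: the function names (`tokenize`, `build_index`, `search_phrase`), the file names, and the sample text are all hypothetical, and the sketch assumes the OCR text has already been extracted from each document as a plain string.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Build an inverted index: word -> {doc_id: [word positions]}."""
    index = defaultdict(dict)
    for doc_id, text in documents.items():
        for position, word in enumerate(tokenize(text)):
            index[word].setdefault(doc_id, []).append(position)
    return index


def search_phrase(index, phrase):
    """Return sorted ids of documents that contain the phrase's words consecutively."""
    words = tokenize(phrase)
    if not words:
        return []

    # Keep only documents that contain every word of the phrase.
    candidates = set(index.get(words[0], {}))
    for word in words[1:]:
        candidates &= set(index.get(word, {}))

    matches = []
    for doc_id in candidates:
        # A match starts wherever each following word sits at the next position.
        for start in index[words[0]][doc_id]:
            if all(start + offset in index[word][doc_id]
                   for offset, word in enumerate(words)):
                matches.append(doc_id)
                break
    return sorted(matches)


# Hypothetical OCR output keyed by document name (assumed, for illustration only).
ocr_text = {
    "invoice-001.pdf": "Invoice total due on receipt",
    "contract-007.pdf": "This contract is due for renewal. Total term: two years.",
}
index = build_index(ocr_text)

print(search_phrase(index, "total due"))
print(search_phrase(index, "due"))
```

With the sample text above, the phrase "total due" matches only the invoice, while the single word "due" matches both documents. Storing word positions, rather than just document ids, is what makes phrase search possible in this sketch.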

The paperless office depends on document scanning and imaging to create the electronic records management repository. Scanned documents can come from dedicated scanners, fax machines, or multifunction centers (MFCs). In all cases, the software allows users to easily import the documents into the system and correctly capture document types and associated metadata.

Document scanning software with OCR

Every sale of FileHold software ships with workstation-based document scanning software that supports all TWAIN scanners. This allows users to bring existing paper documents such as contracts, mail, invoices, and legal documents into the document management system. The OCR software included in the scanning software converts these scanned images into editable and searchable formats such as PDF or Microsoft Word.