Paper documents, whether physical or in electronic image formats like faxes, still play an important role in many companies’ business processes in spite of ever-increasing digitization.  These documents contain important information that needs to be captured, secured and analyzed.

Paper documents that are still in common use include invoices, certain government documents like deeds and birth certificates, and court documents like pleadings.  Converting these documents into electronic formats makes them searchable and easier to secure.

Paper documents are captured using a variety of scanning devices, from dedicated high-speed scanners to mobile phones with built-in cameras.  After scanning, the electronic output is processed by a capture solution to improve quality, extract information and secure the content.  For example, the image is cleaned of specks and punch-hole marks and is “de-skewed” (straightened).  The capture solution can also use OCR (Optical Character Recognition) to detect text within the image, identify bar codes and even identify signatures and handwritten text.  Based on the information detected in the image, the capture solution can classify and categorize the documents and apply different rules for further processing.  Finally, the electronic output is converted into a secure archival format like PDF and stored in a secure repository like SharePoint or Documentum.
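
The stages described above can be sketched as a simple chain of processing steps.  This is only an illustrative outline in Python, not how any particular capture product is built: the names CapturedDocument, clean_image, recognize_text, classify, export and run_pipeline are all hypothetical, and the OCR step is a placeholder that copies pre-supplied text instead of calling a real recognition engine.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class CapturedDocument:
    """Hypothetical record that travels through the capture stages."""
    source: str                  # e.g. "scanner", "mobile", "fax"
    raw_text: str = ""           # stand-in for what an OCR engine would return
    text: str = ""               # filled in by the recognition stage
    doc_class: str = "unknown"   # filled in by the classification stage
    history: List[str] = field(default_factory=list)


Stage = Callable[[CapturedDocument], CapturedDocument]


def clean_image(doc: CapturedDocument) -> CapturedDocument:
    # Placeholder: a real stage would remove specks and de-skew the image.
    doc.history.append("cleaned")
    return doc


def recognize_text(doc: CapturedDocument) -> CapturedDocument:
    # Placeholder for OCR: copies the pre-supplied text (assumption).
    doc.text = doc.raw_text
    doc.history.append("recognized")
    return doc


def classify(doc: CapturedDocument) -> CapturedDocument:
    # Toy rule: treat anything mentioning "invoice" as an invoice.
    if "invoice" in doc.text.lower():
        doc.doc_class = "invoice"
    doc.history.append("classified")
    return doc


def export(doc: CapturedDocument) -> CapturedDocument:
    # Placeholder: a real stage would convert to PDF and store the file.
    doc.history.append("exported")
    return doc


def run_pipeline(doc: CapturedDocument, stages: List[Stage]) -> CapturedDocument:
    """Pass the document through each stage in order."""
    for stage in stages:
        doc = stage(doc)
    return doc


DEFAULT_STAGES: List[Stage] = [clean_image, recognize_text, classify, export]
```

Keeping each stage as a separate function mirrors the modular design of capture products, where image clean-up, recognition, classification and export can be configured or replaced independently.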

Here is a diagram of a generic document capture solution.

Diagram of a General Document Capture Solution

Consider applying this concept to a mortgage processing application.  When a customer is asked to provide documents as part of a mortgage application, the documents may arrive in several ways.  Documents may be mailed to the company, where they are scanned by dedicated high-speed scanners.  Certain personal documents, like a driver's license, may be photographed with a cell phone camera using the company’s mobile application, or sent to the company via email.  Documents may also be faxed.

The document capture solution would recognize text, or a specific sequence of characters, that identifies the loan number, and would then use that number to export the document to the correct location in the content management repository.  If no loan number is detected, the document is routed to a human operator for further processing.  The documents can be classified into document classes (e.g. driver's license, tax forms) based on the data in the document.  The solution can also check whether a signature is present on a form.  The documents are then converted to a secure format such as PDF and stored in the appropriate location in the content management repository.
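
The routing logic in this mortgage example can be sketched in a few lines of Python.  This is a minimal illustration under stated assumptions, not the way Captiva or Kofax implement it: the loan number format ("LN-" followed by eight digits), the keyword lists, the repository path layout and the manual_review_queue destination are all invented for the example, and the input is assumed to be text already produced by OCR.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical loan number format: "LN-" followed by exactly 8 digits.
LOAN_NUMBER_PATTERN = re.compile(r"\bLN-(\d{8})\b")

# Hypothetical keywords that hint at each document class.
CLASS_KEYWORDS = {
    "drivers_license": ("driver license", "driver's license"),
    "tax_form": ("form 1040", "w-2", "wage and tax statement"),
}


@dataclass
class RoutingDecision:
    loan_number: Optional[str]
    document_class: str
    destination: str


def classify(ocr_text: str) -> str:
    """Return the first class whose keywords appear in the text."""
    lowered = ocr_text.lower()
    for doc_class, keywords in CLASS_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return doc_class
    return "unclassified"


def route(ocr_text: str) -> RoutingDecision:
    """Decide where a document goes based on its recognized text."""
    doc_class = classify(ocr_text)
    match = LOAN_NUMBER_PATTERN.search(ocr_text)
    if match is None:
        # No loan number found: hand the document to a human operator.
        return RoutingDecision(None, doc_class, "manual_review_queue")
    loan_number = match.group(1)
    # Hypothetical repository layout: one folder per loan, then per class.
    return RoutingDecision(loan_number, doc_class, f"/loans/{loan_number}/{doc_class}")
```

For instance, route("Loan LN-12345678 Form 1040") would be filed under the loan's tax_form folder, while a driver's license photo with no readable loan number would go to the manual review queue.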

Two of the most common document capture solutions are Captiva and Kofax.  Captiva provides many modules for OCR, image recognition and data extraction.  It also contains modules for connecting to most of the common content management repositories, like Documentum and SharePoint.

Converting paper documents into electronic documents has cut the time needed to find a document from days to a matter of seconds, especially when the paper originals are stored in an off-site location like Iron Mountain.  It also frees up expensive real estate that would otherwise be used as file rooms for paper documents.