Emerging technologies such as robotic process automation (RPA), artificial intelligence (AI), and analytics are rapidly steering the global economy toward new digital frontiers. But with so much information still generated and stored in paper-based, analog formats, companies need tools that can bridge the gap, turning printed text and scanned paper documents into digital files. Optical character recognition (OCR) helps meet this need by automating data extraction and converting analog text documents into machine-readable text.
OCR technology has a range of uses—including a key role in basic accounts payable (AP) automation. Understanding how it works can help you take full advantage of its capabilities as part of your automation and process optimization goals.
What is OCR?
From capturing business cards to extracting incoming invoices from supplier emails, optical character recognition systems specialize in turning printouts into pixels via pattern recognition and electronic capture of visual information.
With time and training, the algorithms used by OCR programs can accurately extract information from scanned documents or even digital image files (e.g., .pdf, .jpg, and .tiff files) and use it to populate forms, verify information, activate applications, and more.
Putting an OCR system into place is often one of the first steps companies take toward AP automation. OCR invoice processing has long been used to free team members from the tedium of retyping invoice data, and is a key component of more comprehensive automation solutions.
How Does Optical Character Recognition Work?
While different applications often have their own proprietary OCR engine (i.e., algorithm) for document scanning and text recognition, all OCR technology relies on a shared three-step process: image pre-processing; character recognition; and post-processing/cleanup of extracted data.
1. Image Pre-processing
To improve the application’s chances of successfully capturing and converting the desired information, OCR software “pre-processes” image data. This involves removing image noise (e.g., dust, background distortions, etc.) and sharpening the target text for maximum clarity and readability by the software.
This phase also includes normalization, which straightens text by removing any skew or distortion, and segmentation (sometimes described as feature detection), in which the software parses the target areas of the image to identify paragraphs, sentences, and individual characters.
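As a rough illustration of the pre-processing ideas above, the sketch below binarizes a tiny grayscale image, clears isolated specks of noise, and segments the result into character spans wherever a blank column appears. The function names, the threshold of 128, and the nested-list image format are assumptions made for this example; production OCR engines use far more sophisticated filters and also correct skew, which is omitted here.

```python
def binarize(gray, threshold=128):
    """Map grayscale pixels (0-255) to 1 (ink) or 0 (background)."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def remove_specks(binary):
    """Clear ink pixels that have no ink in any of their 8 neighbours."""
    h, w = len(binary), len(binary[0])
    cleaned = [row[:] for row in binary]
    for y in range(h):
        for x in range(w):
            if binary[y][x] != 1:
                continue
            neighbours = sum(
                binary[ny][nx]
                for ny in range(max(0, y - 1), min(h, y + 2))
                for nx in range(max(0, x - 1), min(w, x + 2))
                if (ny, nx) != (y, x)
            )
            if neighbours == 0:
                cleaned[y][x] = 0  # isolated pixel: treat as dust
    return cleaned


def segment_columns(binary):
    """Return (start, end) column spans separated by fully blank columns."""
    w = len(binary[0])
    spans, start = [], None
    for x in range(w):
        has_ink = any(row[x] for row in binary)
        if has_ink and start is None:
            start = x
        elif not has_ink and start is not None:
            spans.append((start, x))
            start = None
    if start is not None:
        spans.append((start, w))
    return spans


# Illustrative 3x7 scan: a vertical stroke, a small corner shape, one speck.
sample_scan = [
    [250, 10, 250, 250, 20, 20, 250],
    [250, 10, 250, 250, 20, 250, 250],
    [250, 10, 250, 250, 250, 250, 40],
]
character_spans = segment_columns(remove_specks(binarize(sample_scan)))
```

Each span marks the columns occupied by one candidate character, which is the unit the recognition step then works on.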
2. Character Recognition
Once pre-processing is complete, the OCR application attempts to recognize and convert the text in the target area using feature extraction.
It accomplishes this in two ways. The first is through direct comparison of captured visual data to existing information in its database. Scanning letter shapes and identifying vectors that match existing patterns in its lexicon—a predefined library of fonts, words, phrases, etc.—the software can recognize a letter “T” and record it as such, while recognizing that this shape is distinct from “Y,” for example.
This approach is most useful for printed text obtained via document scanning or importing a digital text file in a format that isn’t compatible with your existing systems.
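The direct-comparison method can be sketched as simple template matching: score a captured glyph against each stored pattern and keep the closest one. The 5x3 bitmaps and the pixel-agreement score below are hypothetical stand-ins chosen for this example; a real engine's lexicon and matching metric will differ.

```python
# Hypothetical 5x3 glyph templates; real engines store far larger font libraries.
TEMPLATES = {
    "T": ["111", "010", "010", "010", "010"],
    "Y": ["101", "101", "010", "010", "010"],
}


def similarity(glyph, template):
    """Fraction of pixels on which the glyph and the template agree."""
    pairs = [
        (g, t)
        for g_row, t_row in zip(glyph, template)
        for g, t in zip(g_row, t_row)
    ]
    return sum(g == t for g, t in pairs) / len(pairs)


def recognize(glyph, templates=TEMPLATES):
    """Return the label of the closest template and its similarity score."""
    label = max(templates, key=lambda name: similarity(glyph, templates[name]))
    return label, similarity(glyph, templates[label])


# A printed "T" whose bottom pixel was lost in scanning.
noisy_t = ["111", "010", "010", "010", "000"]
label, score = recognize(noisy_t)
```

Here the damaged glyph still agrees with the "T" template on more pixels than with the "Y" template, so it is recorded as "T".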
The second method involves more advanced character recognition, supported by machine learning algorithms that elevate standard OCR to what is known as intelligent character recognition, or ICR. Combining basic text recognition with advanced pattern recognition, ICR can make intelligent deductions about what parts of an image are characters, words, and sentences, and assign meaning to them based on both its own lexicon and situational context.
To extend our example, basic OCR might be stymied by a handwritten “T” and assign it the wrong value as a “Y.” ICR, however, can parse the characters around the unfamiliar letter and suggest “Tangerine” as a much more likely solution than “Yangerine,” particularly if the document being scanned is for a fruit company with such terms in its lexicon.
This approach is better suited to decorative and unusual fonts, as well as handwritten text. It leverages more advanced technologies such as natural language processing to tackle the much more complex array of shapes produced by handwriting.
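The "Tangerine" example can be approximated with a small sketch in which each character position carries several candidate readings with confidence scores, and readings that do not form a word in the lexicon are penalized. The confidence values, the penalty factor of 0.1, and the function name are assumptions for illustration only; actual ICR systems rely on trained models rather than a fixed penalty.

```python
from itertools import product


def best_reading(char_candidates, lexicon, penalty=0.1):
    """Pick the most plausible word given per-character candidates.

    char_candidates: list of {character: confidence} dicts, one per position.
    lexicon: set of lowercase words the document is expected to contain.
    penalty: assumed multiplier applied to words missing from the lexicon.
    """
    best_word, best_score = None, -1.0
    for combo in product(*(c.items() for c in char_candidates)):
        word = "".join(ch for ch, _ in combo)
        score = 1.0
        for _, confidence in combo:
            score *= confidence
        if word.lower() not in lexicon:
            score *= penalty
        if score > best_score:
            best_word, best_score = word, score
    return best_word


# The handwritten first letter looks slightly more like "Y" than "T".
candidates = [{"Y": 0.55, "T": 0.45}] + [{ch: 1.0} for ch in "angerine"]
fruit_lexicon = {"tangerine", "orange", "lemon"}

with_context = best_reading(candidates, fruit_lexicon)
without_context = best_reading(candidates, set())
```

With the fruit lexicon the sketch settles on "Tangerine"; with an empty lexicon it falls back to the raw shape scores and returns "Yangerine". Note that enumerating every combination grows quickly with word length, so this brute-force approach only suits short examples.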
3. Post-processing of Extracted Data
Even with help from advanced artificial intelligence, OCR applications need error correction to produce optimal results. Limiting the lexicon to specific expected values can reduce recognition time and errors, especially if OCR is being used to process a high volume of documents containing numbers and codes rather than text.
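As a minimal sketch of limiting the lexicon to expected values, the snippet below treats a field as digits-only and maps a few letters that are commonly confused with digits back to their numeric look-alikes. The confusion table and the function name are illustrative assumptions, not a standard; any field that still contains unexpected characters is rejected for review.

```python
# Assumed look-alike substitutions for a digits-only field (illustrative).
CONFUSIONS = str.maketrans(
    {"O": "0", "o": "0", "I": "1", "l": "1", "S": "5", "B": "8"}
)


def clean_numeric_field(raw):
    """Return the corrected digit string, or None if it cannot be trusted."""
    cleaned = raw.strip().translate(CONFUSIONS)
    return cleaned if cleaned.isdigit() else None


corrected = clean_numeric_field("1O45B")
rejected = clean_numeric_field("12A4")
```

Returning None rather than guessing keeps questionable values out of downstream systems and flags them for a human to check.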
Other post-processing tools include iterative analysis, which makes multiple passes and compares the results each time to identify the most likely and accurate output, as well as advanced algorithms that parse scanned text to determine context and correct spelling and grammar errors accordingly.
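One simple way to picture iterative analysis is a character-by-character majority vote across several recognition passes. The sketch below assumes the passes have already produced plain strings and keeps only those of the most common length; the function name and sample readings are hypothetical, and ties are resolved arbitrarily in favor of the reading seen first.

```python
from collections import Counter


def vote(readings):
    """Combine several OCR passes by majority vote at each character position."""
    # Keep only readings of the most common length so positions line up.
    length = Counter(len(r) for r in readings).most_common(1)[0][0]
    aligned = [r for r in readings if len(r) == length]
    return "".join(
        Counter(chars).most_common(1)[0][0] for chars in zip(*aligned)
    )


# Three passes over the same invoice number, each with a different slip.
passes = ["INV-1O23", "INV-1023", "lNV-1023"]
consensus = vote(passes)
```

Because each pass makes its mistake in a different position, the consensus recovers "INV-1023" even though only one individual pass read it correctly.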
Common OCR Applications
Since commercial systems first appeared in the mid-twentieth century, OCR technology has helped individuals and businesses capture and interact with analog data in interesting and useful ways. Today, OCR continues to be used in a wide array of applications, including:
- Data management. Companies looking to break free from paper-based documentation and its associated expense, environmental impact, and inefficiency use OCR to digitize existing information and create new workflows that capture and store new information automatically.
- Mobile applications. The ubiquity of cell phones, tablets, and other mobile devices has made it possible to put advanced OCR capabilities in the pockets and purses of folks around the world as features built into mobile apps. A few examples include:
  - Personal identification sharing and ID confirmation, e.g. digital-friendly passports and IDs.
  - QR codes, used for everything from registering for sweepstakes to sharing links to providing instant discounts on goods and services.
  - On-demand scanning and ICR-based text recognition from mobile devices, used for depositing checks, capturing important documents, etc.
  - Using a mobile-friendly version of the company’s Cloud Vision AI algorithm, Google Translate for Android blends OCR with augmented reality to let users scan signs, documents, and images to extract and translate text in real time.
- License plate scanning. OCR-powered Automatic License Plate Recognition (ALPR) has proven very useful, and popular, with parking control and law enforcement officials in tracking vehicles and fighting crime.
- Academic assessment. Cambridge Assessment uses OCR to automate and streamline its testing procedures and provide advanced search and analysis tools for reviewers and instructors.
- Process automation. OCR reduces the need for manual data entry and document scanning, cutting labor and resource costs and helping companies streamline their workflows.
OCR in AP Automation
One of the specific ways OCR is used in process automation is in accounts payable, where putting an OCR system into place is often among the first steps companies take toward AP automation. Beyond freeing team members from the tedium of retyping invoice data, OCR invoice processing simplifies invoice verification, and it is often featured as a key component of more comprehensive automation solutions.
It’s easy to see why. Optical character recognition improves speed and accuracy of invoice processing while lowering costs. It provides a versatile toolkit for consolidating data from a wide range of analog and digital sources, and provides a solid foundation for more ambitious digital transformation efforts at the organization level.
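To make the invoice case concrete, here is a hedged sketch of what happens after OCR has produced raw text: simple patterns pull out the fields an AP team would otherwise retype. The field labels, regular expressions, and sample invoice are assumptions for illustration; real supplier invoices vary widely in layout, which is one reason pattern-based extraction needs ongoing tuning.

```python
import re
from decimal import Decimal

# Assumed label formats; suppliers' invoices will differ in practice.
INVOICE_NO = re.compile(
    r"Invoice\s*(?:No\.?|#|Number)\s*:?\s*([A-Z0-9-]+)", re.IGNORECASE
)
TOTAL = re.compile(
    r"\bTotal\s*(?:Due)?\s*:?\s*\$?\s*([\d,]+\.\d{2})", re.IGNORECASE
)


def extract_invoice_fields(ocr_text):
    """Pull the invoice number and total out of OCR output, if present."""
    number = INVOICE_NO.search(ocr_text)
    total = TOTAL.search(ocr_text)
    return {
        "invoice_number": number.group(1) if number else None,
        "total": Decimal(total.group(1).replace(",", "")) if total else None,
    }


# Hypothetical OCR output for a scanned supplier invoice.
sample_text = (
    "ACME Fruit Co.\n"
    "Invoice No: INV-1023\n"
    "Tangerines 40 x 12.50\n"
    "Total Due: $1,500.00\n"
)
fields = extract_invoice_fields(sample_text)
```

Fields that cannot be found come back as None, so they can be routed to a person for review instead of being silently filled in.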
On the other hand, OCR is an aging technology that needs significant support from both human and artificial intelligence to yield optimal results. Standalone OCR solutions may require significant upfront investment, increasing total cost of ownership (TCO) and reducing return on investment (ROI).
They can also require substantial training to tackle invoices streaming in from the multiple systems used by your suppliers—and that doesn’t include the training and resources your staff will need to help the OCR engine reach its full potential. And, as digital transformation continues to reshape modern commerce, the increasingly paperless office may soon leave standalone OCR software standing next to the buggy whips and butter churns.
Ultimately, OCR’s limitations mean it simply cannot provide a complete AP automation solution.
Businesses that want to get the best possible value from their OCR investment should consider it an essential part of the early stages of digital transformation, but also recognize that it will eventually be rendered largely obsolete by the process improvements that come with a truly comprehensive automation solution such as PLANERGY, and plan accordingly.
OCR Helps You Capture Savings and Value as Well as Text
Used effectively, optical character recognition can provide substantial benefits to any business, and definitely has a part to play in boosting efficiency and performance. Once you understand its capabilities and limitations, you can put it to work in your own workflows and make it part of your digital transformation plans. With careful planning and a proactive approach, you can cut costs, save time, and help your company move confidently toward a fully digital future.