Effective Capture is the First Step in Digital Transformation

Effective Capture

HARVEY SPENCER ASSOCIATES

_____________________________________________________________


By Harvey Spencer, President, HSA Inc.

© 2017 HSA Inc.


Background

Digital transformation with SAP is not just a buzzword. To compete effectively in a fast-moving global economy, all businesses have to transform their methods and processes. The promise is better competitiveness, happier customers, fast and efficient processes, and analytic tools for better management. But change is challenging as we convert from legacy systems to a digital economy. Receiving and capturing information from paper, email attachments, fax and electronic transmissions continue to present challenges to all organizations. Still, this is often ignored as companies focus on streamlining the processing of electronic data and on new applications. The problem is that without a solution to automate the conversion and understanding of unstructured and semi-structured inputs, including legacy ones, manual data entry becomes a huge business burden. This is not only a cost at the front end: missing or late information leads to exponential cost, risk, and bad decisions downstream.

Digital transformation promises to eliminate these paper-oriented inputs. Although their share of overall transactions is declining, paper and paper-formatted documents continue to be a major method of transactional communication between businesses. Sometimes, despite the acceptance of digital signatures, this is because a 'wet signature' is still required on the paper (see Certificate of Insurance below). But even when the physical paper is eliminated through PDFs and digital signatures delivered via email, the problems of processing remain.

The numbers are large and processing is expensive. Worldwide, we believe that around 50bn transactions a year need to be converted, a total of around 6tn characters to be captured. As one data point, the US mail shipped over 61bn pieces of first class mail in 2016, much of it business related. While this has been declining, it is worth noting that the rate of decline has been slowing.

Depending on the application and country, 80% or more of external communications is still paper based. With an average manual data entry cost of 63 cents, the cost is enormous.

In the case of invoices, for example, we estimate that over 5bn sheets of paper are sent out every year in the US from companies with over $5m revenue, and there will still be over 4bn by 2023. If PDF invoices, which some call EDI, are included, the volumes are much higher, but the processing challenges remain.

Most of this information is manually keyed. An average invoice form contains around 120 characters of information that must be captured into the SAP system. At an average wage rate in western countries of $30,000 a year, it costs around $37,500 a year to capture the information from just 50,000 invoices, or roughly 4,000 a month. So in the US alone, we are likely spending over $4bn a year keying information from invoices. Adding Europe, with a similar-sized economy, likely doubles that, with the rest of the world adding more.
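To make the scale of these figures easier to see, the short sketch below derives the implied unit costs from the numbers quoted above. It uses only the figures in the text; the variable names are illustrative.

```python
# Figures quoted in the text above.
invoices_per_year = 50_000
keying_cost_per_year = 37_500   # USD per year
chars_per_invoice = 120

# Implied unit costs and monthly volume.
cost_per_invoice = keying_cost_per_year / invoices_per_year
cost_per_char = cost_per_invoice / chars_per_invoice
invoices_per_month = invoices_per_year / 12

print(f"cost per invoice:   ${cost_per_invoice:.2f}")
print(f"cost per character: ${cost_per_char:.5f}")
print(f"invoices per month: {invoices_per_month:.0f}")
```

On these figures, each manually keyed invoice costs about 75 cents, or a little over half a cent per character.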

But when we look at all the different departments and areas within a typical company using SAP, many hundreds of thousands of forms and other documents must be keyed, at a cost of several hundred thousand dollars a year per company. Paper formats are used in Purchasing, Orders, Contracts, Shipping, Human Resources, Sales and Marketing, and many other areas. In Banking and Insurance they are used for on-boarding new customers, for loans and for filing claims. Much of this is manually keyed. Not only is this expensive, it also introduces risk and is time consuming.

To solve the problem, companies such as Open Text have developed advanced capture technologies that dramatically reduce the cost of extracting and validating information and updating the SAP core systems. They make the process faster and more accurate, and provide the ability to capture additional information effectively without added cost.

Speed of processing is important and is one of the benefits of going digital, but if legacy input systems cannot keep up, the advantage is lost and more nimble companies will overtake. Accuracy is critical because systems built on in-memory, real-time platforms such as S/4 HANA do not give you time to correct mistakes before processing begins: processes are likely to have already been set in motion and decisions made. Undoing these becomes expensive and adds to risk.

Reducing Cost & Faster Processing

These systems are designed to reduce the cost of capturing information from incoming forms on a broader basis by leveraging advanced Optical Character Recognition (OCR) and other recognition capabilities combined with AI-based machine learning systems. Identified keywords are used to locate the needed information, and the business and validation rules associated with that information are then applied. These are combined with a learning system that continuously learns what the variations in the form are and what clerical staff do, in order to build the automated extraction and validation systems. In the case of Open Text, as far as the user is concerned, this all operates within the user's SAP systems, whether on-premise or in the cloud.

The approach uses OCR to convert the whole document set to text. Taking the illustration below of a Certificate of Liability Insurance, the system ignores 'boilerplate text' and looks for 'key words' such as 'Insurer' and 'Insured', as well as the tabular amounts insured and the signature. It finds the check boxes, such as type of insurance and liability. In the case of an invoice, it looks for key information including 'Invoice Number', 'PO Number' and 'Date'.

As you may see, the Certificate of Insurance example was scanned from a printed fax. The information can be difficult to read and the paper may have been angled or distorted. Today's image enhancement and improved OCR deal with most issues: if it is legible to a human, the machine can read it, and in some cases the machine can outperform a human. But in the case of faxes, problems caused by the sending machine are compounded if you receive a fax, print it and then scan it. Sending directly from a computer system and receiving on a fax server, or converting to a PDF and delivering by email, helps substantially because lines are straight and characters are well formed and cleanly separated.

The system then uses key words to locate the information, or the box next to it, and extracts it, usually tagging it with XML tags so that it can easily be imported. This was previously impractical due to performance constraints. The system also identifies other elements such as tabular 'line item' information, where details such as coverage or, in the case of an invoice, quantity, product ID, cost and extension are laid out.
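The keyword-driven step described above can be illustrated with a minimal sketch. It assumes the OCR stage has already produced plain text, and it uses a simple regular expression per keyword; this is not Open Text's implementation. The sample text, the keyword list and the XML tag names are all hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical OCR output for an invoice (illustrative only).
OCR_TEXT = """\
ACME Supplies Ltd.
Invoice Number: INV-10042
PO Number: 778812
Date: 2017-03-14
Thank you for your business.
"""

# Keyword on the form -> XML tag name. Both are assumptions for this sketch.
KEYWORDS = {
    "Invoice Number": "InvoiceNumber",
    "PO Number": "PONumber",
    "Date": "Date",
}

def extract_fields(text):
    """Return {tag: value} for each keyword found at the start of a line."""
    fields = {}
    for keyword, tag in KEYWORDS.items():
        # Keyword, optional colon, then the value up to the end of the line.
        pattern = rf"^{re.escape(keyword)}\s*:?\s*(.+)$"
        match = re.search(pattern, text, flags=re.MULTILINE)
        if match:
            fields[tag] = match.group(1).strip()
    return fields

def to_xml(fields):
    """Wrap the extracted values in XML tags ready for import."""
    root = ET.Element("Invoice")
    for tag, value in fields.items():
        ET.SubElement(root, tag).text = value
    return ET.tostring(root, encoding="unicode")

fields = extract_fields(OCR_TEXT)
xml_out = to_xml(fields)
print(xml_out)
```

Lines that match no keyword, such as the company name and the closing courtesy line, are simply ignored, which mirrors the way boilerplate text is skipped. A production system would also have to cope with OCR errors, label variants and values placed below or beside their labels, none of which this sketch attempts.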

Set-up time for new document types is greatly reduced compared with previous methods. When a new format or document type is encountered, the system may be able to adjust automatically if the form is similar in layout, language and/or terms to a 'known', previously captured form. But if it is a new type of form, the proposed zones need to be adjusted and rules added. New capability leverages previous knowledge and combines it with the operator's activities and knowledge to automatically set up and refine the extraction and validation process.

One challenge has been the extraction of tabular information such as line items on an invoice. Line item extraction is not easy as there are many variables. A simple case is one item equals one line: typically a quantity, description, unit amount and extended amount (see figure below).

In this case it is fairly easy to extract each line from the table and balance the lines against the extended amounts and the totals.
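For the simple one-item-per-line case, the balancing check can be sketched as follows. The sketch assumes the line items have already been extracted into rows; the sample rows and the invoice total are hypothetical, and decimal arithmetic is used to avoid floating-point rounding on currency amounts.

```python
from decimal import Decimal

# Hypothetical extracted rows: (quantity, description, unit amount, extended amount).
line_items = [
    (Decimal("2"), "Widget", Decimal("9.50"), Decimal("19.00")),
    (Decimal("10"), "Bracket", Decimal("1.25"), Decimal("12.50")),
    (Decimal("1"), "Freight", Decimal("8.00"), Decimal("8.00")),
]
invoice_total = Decimal("39.50")

def balance(items, total):
    """Return a list of problems; an empty list means the table balances."""
    problems = []
    for row, (qty, desc, unit, extended) in enumerate(items, start=1):
        # Each line: quantity x unit amount must equal the extended amount.
        if qty * unit != extended:
            problems.append(f"line {row} ({desc}): {qty} x {unit} != {extended}")
    # The extended amounts must add up to the invoice total.
    if sum(item[3] for item in items) != total:
        problems.append("sum of extended amounts != invoice total")
    return problems

print(balance(line_items, invoice_total))
```

A line that fails either check is a natural candidate for operator review, since the mismatch may come from a misread digit as easily as from an error on the invoice itself. Discounts and taxes are deliberately left out of this sketch.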

But invoices and other forms are frequently not as simple as this. In the case of an invoice, item descriptions may take up multiple lines, with the number of lines varying by description. They may even carry over from one page to another, with a break point wherever the output system decided to put it. Totals and where and how discounts have been calculated as well as taxes have to be
