This article discusses ways of extracting information from sales receipts and demonstrates in detail how to use pre-built ML models via Azure Form Recognizer/Azure Cognitive Services.
Introduction
Nowadays, with almost everything moving to online and virtual modes, a very common problem organizations face is the processing of receipts that were scanned and submitted electronically for reimbursement.
For any claim or reimbursement to be cleared, it must first reach the proper accounts department, depending on the organization and the sector. One way to do this is manually: a person or a team goes through all the digitally scanned receipts and filters them by department or by whatever other validation and eligibility criteria apply.
The situation gets worse when the volume of scanned receipts is high. To get rid of this manual effort, many organizations have already opted for an AI-based solution, and many more are in the process of doing so.
One can certainly use OCR (Optical Character Recognition) technologies to extract the data, but the problem here is not only about data extraction; it is also about data interpretation. A user may, for instance, upload the wrong document altogether, one that is not a receipt. So, the solution should be robust enough to filter out such cases.
How can AI-based Solutions be Achieved?
As with many other tasks, Azure offers a service for this one: Form Recognizer, which provides intelligent processing capabilities and allows us to automate the processing of forms and receipts. Basically, it is a combination of OCR and predictive models, and it falls under the umbrella of Azure Cognitive Services.
Here, OCR handles the text extraction, and the models help us pick out the useful information, such as invoice date, address, amount, description, name, or any other relevant field the business demands.
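The wrong-document case mentioned earlier can be handled on top of the extracted fields. The following is only a minimal sketch, not part of Form Recognizer itself: it assumes the extraction step has been reduced to a plain dictionary that maps a field name to a (value, confidence) pair. The function name, the choice of required fields, and both thresholds are assumptions made for illustration.

```python
# Fields a genuine sales receipt is expected to contain.
# This particular selection is an assumption for illustration.
REQUIRED_FIELDS = ("MerchantAddress", "TransactionDate", "Total")


def looks_like_receipt(fields, min_confidence=0.5, min_matches=2):
    """Return True if enough required fields were found with enough confidence.

    `fields` maps a field name to a (value, confidence) tuple.
    """
    matches = 0
    for name in REQUIRED_FIELDS:
        # A missing field counts as "no value, zero confidence"
        value, confidence = fields.get(name, (None, 0.0))
        if value is not None and confidence >= min_confidence:
            matches += 1
    return matches >= min_matches


# Hypothetical extraction results: one plausible receipt, one unrelated document
receipt = {"MerchantAddress": ("123 Main Street", 0.93), "Total": (14.5, 0.88)}
letter = {"MerchantAddress": ("123 Main Street", 0.21)}

print(looks_like_receipt(receipt))  # True
print(looks_like_receipt(letter))   # False
```

Documents that fail such a check could be routed to a person for manual review instead of being rejected outright; suitable thresholds depend on the receipts an organization actually receives.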
Which Models are Supported by Form Recognizer?
Form Recognizer supports two types of models: pre-built and custom.
- Pre-built models – ones that are provided out of the box and are already trained on basic sales data in the USA sales format
- Custom models – ones that can be tailored to our business needs using our own data
In this article, I'll focus on the pre-built models; custom model integration will be covered in another article.
How to Get Started with Form Recognizer?
The very first thing we need to do is log in to the Azure portal at portal.azure.com and create an Azure resource. There are two ways to create the resource:
- Using Azure Form Recognizer
- Using Azure Cognitive Services
If you are planning to use other services under Cognitive Services, an existing or new Cognitive Services resource can be used. But if you need to work only with the Form Recognizer service, a dedicated resource can be created as shown below:
Once Form Recognizer is selected, fill in all the basic details in the form below:
Click on Review + Create, and Azure will create the resource along with its key and endpoint.
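Since the key grants access to the resource, it is usually safer to keep it out of the notebook itself. Here is a minimal sketch of reading both values from environment variables. The variable names FORM_RECOGNIZER_ENDPOINT and FORM_RECOGNIZER_KEY and the helper function are hypothetical; they are not defined by Azure.

```python
import os


def load_form_recognizer_settings(env=os.environ):
    """Return (endpoint, key) read from environment variables.

    The variable names are hypothetical; use whatever your setup defines.
    """
    endpoint = env.get("FORM_RECOGNIZER_ENDPOINT", "").strip()
    key = env.get("FORM_RECOGNIZER_KEY", "").strip()

    # Collect the names of all settings that are absent or empty
    missing = [
        name
        for name, value in (
            ("FORM_RECOGNIZER_ENDPOINT", endpoint),
            ("FORM_RECOGNIZER_KEY", key),
        )
        if not value
    ]
    if missing:
        raise RuntimeError("Missing settings: " + ", ".join(missing))
    return endpoint, key
```

The returned pair can then be used wherever the endpoint and key are needed, and failing early with a clear message avoids a harder-to-read authentication error later.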
Using the Code
For development, I'm using Python as the language and Visual Studio Code with Jupyter Notebook. Here is the core implementation:
from azure.ai.formrecognizer import FormRecognizerClient
from azure.core.credentials import AzureKeyCredential

# Key and endpoint of the Azure resource created above
key = "KEY_TO_BE_REPLACED"
endPoint = "ENDPOINT_TO_BE_REPLACED"
client = FormRecognizerClient(endpoint=endPoint, credential=AzureKeyCredential(key))

# Send the scanned receipt to the pre-built receipt model and wait for the result
image = "IMAGE_FILE_PATH"
with open(image, "rb") as fd:
    analyzeReceipt = client.begin_recognize_receipts(receipt=fd)
    result = analyzeReceipt.result()

receipt = result[0]  # first recognized receipt; assumes at least one was returned

def field_value(name):
    # Return the field's value, or None when the model did not find that field
    field = receipt.fields.get(name)
    return field.value if field else None

print('Address: ', field_value("MerchantAddress"))
print('Contact Number: ', field_value("MerchantPhoneNumber"))
print('Receipt Date: ', str(field_value("TransactionDate")))
print('Tax Paid: ', field_value("Tax"))
print('Total Amount Paid: ', field_value("Total"))

# Each entry in "Items" is itself a field whose value maps sub-field names to fields
items_field = receipt.fields.get("Items")
if items_field:
    for item in items_field.value:
        for item_name, item_field in item.value.items():
            print(item_name, ': ', item_field.value)
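Besides a value, each recognized field also carries a confidence score, which can be used to decide which values to trust. The sketch below is one possible way to do that, not the SDK's own mechanism: it assumes only that every field object exposes value and confidence attributes, and it uses SimpleNamespace stand-ins so that it runs without an Azure resource. The function name and the 0.8 threshold are assumptions.

```python
from types import SimpleNamespace


def split_by_confidence(fields, min_confidence=0.8):
    """Split fields into accepted values and names that need manual review."""
    accepted, needs_review = {}, []
    for name, field in fields.items():
        if name == "Items":
            continue  # line items are nested and would be handled separately
        confidence = field.confidence or 0.0  # treat a missing score as zero
        if field.value is not None and confidence >= min_confidence:
            accepted[name] = field.value
        else:
            needs_review.append(name)
    return accepted, needs_review


# Stand-in objects with made-up values, used in place of a real service response
sample_fields = {
    "MerchantAddress": SimpleNamespace(value="123 Main Street", confidence=0.95),
    "Total": SimpleNamespace(value=14.5, confidence=0.62),
    "Tax": SimpleNamespace(value=None, confidence=None),
}

accepted, needs_review = split_by_confidence(sample_fields)
print(accepted)      # {'MerchantAddress': '123 Main Street'}
print(needs_review)  # ['Total', 'Tax']
```

With a real response, the dictionary passed in would be the fields attribute of a recognized receipt, such as result[0].fields in the implementation above.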
Sample Input and Output
I've taken the receipt below as the input:
and the output is generated as below:
Summary
This article covers the high-level steps for using a pre-built ML model to read information from a sales receipt, on the assumption that the reader already knows how to use Python, VS Code, and Jupyter Notebook, and how to import Python modules. If you are new to any of these, I would recommend watching the video of this implementation here.
History
- 9th July, 2021: Initial version