The purpose of this article is to teach you how to perform OCR using C++ by interfacing with an OCR SDK.
Introduction
In our day-to-day development, we needed to perform OCR (Optical Character Recognition) on scanned images, screenshots, and other forms or files. We looked for an SDK that would allow that and examined the ABBYY Cloud OCR SDK. It didn't come with any C / C++ code samples, so I had to develop them...
Creating an ABBYY Cloud OCR App
Here are the steps to take once a trial account has been created and verified.
- Create a new App.
- Check your email for the Application's password. You will need both Application ID and Application Password to start.
You should see these two placeholder lines in our code:
#define APP_ID "<Your Application ID>"
#define PASSWORD "<Your Application Password>"
Replace these strings with your allocated Application ID and Application Password.
Initiating libCurl
First, we initiate the Curl object:
curl_global_init(CURL_GLOBAL_ALL);
curl = curl_easy_init();
Handling the Input File
Our input file can be any image or even a PDF file. We need to wrap it in a MIME structure, and to do so, we call:
curl_mime_init() (https://curl.se/libcurl/c/curl_mime_init.html)
Next, we generate the upload part in our request:
field = curl_mime_addpart(form);
curl_mime_name(field, "upload");
and generate the file data using curl_mime_filedata(), which sets our MIME part's body data from our input file's contents.
curl_mime_filedata(field, file_to_upload.c_str());
Now we set the options by calling curl_easy_setopt(), which, as its name implies, sets the options for our request.
We need the following attributes:
- PROCESSING_URL is the URL given by the ABBYY SDK.
- headerlist was set earlier.
- form is the upload part.
- APP_ID is an application-specific identifier provided by the ABBYY SDK for each piece of software we develop.
- PASSWORD is the application's password, which needs to be generated.
- CURLOPT_WRITEFUNCTION is a callback function for writing the result of the request. Data is written into readBuffer, which will hold the results we receive from the API.
curl_easy_setopt(curl, CURLOPT_URL, PROCESSING_URL);
curl_easy_setopt(curl, CURLOPT_HTTPHEADER, headerlist);
curl_easy_setopt(curl, CURLOPT_MIMEPOST, form);
curl_easy_setopt(curl, CURLOPT_USERNAME, APP_ID);
curl_easy_setopt(curl, CURLOPT_PASSWORD, PASSWORD);
curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, CurlWrite_CallbackFunc_StdString);
curl_easy_setopt(curl, CURLOPT_WRITEDATA, &readBuffer);
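The body of CurlWrite_CallbackFunc_StdString is not shown in this article. A minimal sketch of what such a write callback might look like follows. The function name matches the one used above, but the body is an assumption; the parameter list follows libcurl's documented write-callback prototype, and userdata is assumed to be the std::string pointer passed through CURLOPT_WRITEDATA.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Sketch of a libcurl write callback (assumed body, see the note above).
// libcurl hands over a chunk of size * nmemb bytes that is NOT
// null-terminated; userdata is the pointer given to CURLOPT_WRITEDATA.
size_t CurlWrite_CallbackFunc_StdString(char* contents, size_t size,
                                        size_t nmemb, void* userdata)
{
    assert(userdata != nullptr);
    const size_t bytes = size * nmemb;
    std::string* out = static_cast<std::string*>(userdata);
    try
    {
        out->append(contents, bytes);
    }
    catch (...)
    {
        // Reporting fewer bytes than received makes libcurl abort the transfer.
        return 0;
    }
    return bytes;
}
```

Because libcurl may deliver a response in several chunks, the callback appends rather than assigns, so readBuffer ends up holding the whole body.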
Then, we are ready to execute our request. We call curl_easy_perform().
res = curl_easy_perform(curl);
Right after that, we check the results.
Checking the Results
Next, we obtain the Task_ID, which is given to each OCR task. You can initiate several tasks and then wait for each of them to be completed. We use the Task_ID we have obtained to check the status of the task and wait until it is completed.
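The polling loop that follows relies on ObtainStatus() and ObtainURL(), which are not shown in this article. A minimal sketch of how such helpers could be built on one attribute lookup is given here. The name ExtractAttribute is hypothetical, and the sketch assumes the service answers with a small XML document whose task element carries attributes such as status and resultUrl. It uses plain string search, not a real XML parser.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: returns the value of attribute `name` at its first
// occurrence in `xml`, or an empty string if the attribute is absent.
// Assumes the attribute is preceded by a single space and uses double
// quotes, e.g. <task id="..." status="Completed" resultUrl="..."/>.
std::string ExtractAttribute(const std::string& xml, const std::string& name)
{
    assert(!name.empty());
    // The leading space and trailing =" keep "status" from matching
    // a longer attribute name such as "statusChangeTime".
    const std::string key = " " + name + "=\"";
    const std::string::size_type start = xml.find(key);
    if (start == std::string::npos)
        return "";
    const std::string::size_type value_begin = start + key.size();
    const std::string::size_type value_end = xml.find('"', value_begin);
    if (value_end == std::string::npos)
        return "";
    return xml.substr(value_begin, value_end - value_begin);
}
```

Under these assumptions, ObtainStatus(readBuffer) could be a thin wrapper around ExtractAttribute(readBuffer, "status"), and ObtainURL(readBuffer) around ExtractAttribute(readBuffer, "resultUrl"). The returned value is still XML-escaped.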
while (1)
{
    // Ask the server for the current status of our task.
    res = curl_easy_perform(status_curl);
    if (res != CURLE_OK)
    {
        WriteLogFile(L"Error: %S", curl_easy_strerror(res));
    }
    else
    {
        WriteLogFile(L"Read Buffer:\n%S", readBuffer.c_str());
        task_status = ObtainStatus(readBuffer);
        WriteLogFile(L"task_status: %S", task_status.c_str());
    }
    if (task_status != "Completed")
    {
        // Not ready yet: wait two seconds before polling again.
        Sleep(2000);
    }
    else
    {
        setcolor(LOG_COLOR_DARKGREEN, 0);
        SetConsoleTitle(L"OCR completed");
        setcolor(LOG_COLOR_WHITE, 0);
        result_url = ObtainURL(readBuffer);
        // The URL arrives XML-escaped, so decode the ampersands.
        result_url = ReplaceAll(result_url, "&amp;", "&");
        WriteLogFile(L"Downloading File from URL: %S", result_url.c_str());
        op_curl = curl_easy_init();
        if (op_curl)
        {
            headerlist = curl_slist_append(headerlist, buf);
            curl_easy_setopt(op_curl, CURLOPT_URL, result_url.c_str());
            curl_easy_setopt(op_curl, CURLOPT_HTTPHEADER, headerlist);
            curl_easy_setopt(op_curl, CURLOPT_HTTPGET, 1L);
            // No write callback is set on this handle, so libcurl's default
            // callback writes the response body to this FILE*.
            FILE* wfd = fopen(json_result_file.c_str(), "ab");
            if (wfd)
            {
                fprintf(wfd, "\n");
                curl_easy_setopt(op_curl, CURLOPT_WRITEDATA, wfd);
                curl_easy_perform(op_curl);
                fclose(wfd);
                WriteLogFile(L"FILE saved");
            }
            else
            {
                WriteLogFile(L"Error: could not open the result file");
            }
            curl_easy_cleanup(op_curl);
        }
        break;
    }
    // Clear the buffer so the next status response is not appended to this one.
    readBuffer = "";
}
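The download step calls ReplaceAll(), whose body is not shown in this article. A minimal sketch of such a helper follows, assuming it replaces every non-overlapping occurrence of one substring with another and returns the result. The article's actual implementation may differ.

```cpp
#include <cassert>
#include <string>

// Sketch of a ReplaceAll helper (assumed behavior, see the note above):
// returns a copy of `text` in which every occurrence of `from` has been
// replaced with `to`, scanning from left to right.
std::string ReplaceAll(std::string text, const std::string& from,
                       const std::string& to)
{
    assert(!from.empty());  // an empty search string would never advance
    std::string::size_type pos = 0;
    while ((pos = text.find(from, pos)) != std::string::npos)
    {
        text.replace(pos, from.size(), to);
        pos += to.size();  // skip the inserted text so it is not rescanned
    }
    return text;
}
```

A typical use is undoing XML escaping in the result URL, for example turning every &amp; back into a plain & before the URL is handed to libcurl.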
Now once we have the results, we just clean up everything.
Building Blocks
One of the key building blocks of such a project is libCurl. I used it as a static library. The .lib file is included in the article's source code; you can also read about using libCurl as a static library here.
Note: WriteLogFile() is one of my old logging functions described in this article.
Using the Code
You can use a different export format. See this link for the options.
You can define which languages you are expecting. Read this link for the options.
You can use many languages, and most of them can also be recognized as handwritten text (ICR). You set the list of expected languages in the PROCESSING_URL string.
In this example, we expect English and Hebrew:
#define PROCESSING_URL "https://cloud-westus.ocrsdk.com/processImage?exportFormat=txt&language=English,Hebrew"
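If the export format or the language list is only known at run time, the same string can be assembled from its parts instead of being fixed in a #define. This is a minimal sketch: the helper name BuildProcessingUrl is hypothetical, and it assumes the values are plain ASCII names that need no URL encoding.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: builds a processing URL of the same shape as
// PROCESSING_URL from a base URL, an export format, and a language list.
// No URL encoding is done; values are assumed to be plain ASCII names.
std::string BuildProcessingUrl(const std::string& base_url,
                               const std::string& export_format,
                               const std::vector<std::string>& languages)
{
    assert(!languages.empty());  // at least one expected language
    std::string url = base_url + "?exportFormat=" + export_format + "&language=";
    for (std::size_t i = 0; i < languages.size(); ++i)
    {
        if (i > 0)
            url += ',';  // languages are comma-separated
        url += languages[i];
    }
    return url;
}
```

Calling BuildProcessingUrl("https://cloud-westus.ocrsdk.com/processImage", "txt", {"English", "Hebrew"}) yields the same string as the #define above, and the result's c_str() can be passed to CURLOPT_URL.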
History
- 30th January, 2021: Initial version