OCR using C++

Michael Haephrati

30 Jan 2021 | CPOL | 3 min read

How to use an OCR SDK from C++ with libCurl

The purpose of this article is to teach you how to perform OCR using C++ by interfacing with an OCR SDK.

Introduction

During our day-to-day development, we needed to perform OCR (Optical Character Recognition) on scanned images, screenshots and other forms or files. We looked for an SDK that would allow that and examined the ABBYY Cloud OCR SDK. It didn't come with any C / C++ code samples, so I had to develop them...

Creating an ABBYY Cloud OCR App

Here are the steps to take once a trial account has been created and verified.

Create a new App.
Check your email for the Application's password. You will need both Application ID and Application Password to start.

You should see these two placeholder lines in our code:

C++

#define APP_ID            "<Your Application ID>"
#define PASSWORD        "<Your Application Password>"

Replace these strings with your allocated Application ID and Application Password.

Initiating libCurl

First, we initialize libCurl and create the Curl handle:

C++

curl_global_init(CURL_GLOBAL_ALL);
curl = curl_easy_init();

Handling the Input File

Our input file can be any image or even a PDF file. libCurl uploads it as part of a MIME (multipart) form, so we first create the form by calling curl_mime_init():

C++

form = curl_mime_init(curl);    // see https://curl.se/libcurl/c/curl_mime_init.html

Next, we generate the upload part in our request:

C++

field = curl_mime_addpart(form);
curl_mime_name(field, "upload");

and generate the file data using curl_mime_filedata(), which sets our MIME part's body data from our input file's contents.

C++

curl_mime_filedata(field, file_to_upload.c_str());

Now, we set the options by calling curl_easy_setopt(), which, as its name implies, sets the options for our request.

We need the following attributes:

PROCESSING_URL is the URL of the ABBYY Cloud OCR SDK processing endpoint.
headerlist is the list of HTTP headers, which was set earlier.
form is the MIME form holding the upload part.
APP_ID is the application-specific identifier that ABBYY provides for each application we develop.
PASSWORD is the application's password, which is generated for each App.
CURLOPT_WRITEFUNCTION sets the callback function that receives the response to the request. The callback writes the data into readBuffer (passed via CURLOPT_WRITEDATA), which holds the results we receive from the API.

C++

curl_easy_setopt(curl, CURLOPT_URL, PROCESSING_URL);
curl_easy_setopt(curl, CURLOPT_HTTPHEADER, headerlist);
curl_easy_setopt(curl, CURLOPT_MIMEPOST, form);
curl_easy_setopt(curl, CURLOPT_USERNAME, APP_ID);
curl_easy_setopt(curl, CURLOPT_PASSWORD, PASSWORD);
curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, CurlWrite_CallbackFunc_StdString);
curl_easy_setopt(curl, CURLOPT_WRITEDATA, &readBuffer);
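The callback passed to CURLOPT_WRITEFUNCTION is not listed in this article. A minimal sketch of what such a callback could look like is shown here; the body is an assumption, not the author's implementation. It only relies on libCurl's documented contract: the callback receives a chunk of size * nmemb bytes plus the CURLOPT_WRITEDATA pointer, and must return the number of bytes it handled.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for the article's CurlWrite_CallbackFunc_StdString.
// libCurl may call it several times per response, once per received chunk.
size_t CurlWrite_CallbackFunc_StdString(char* contents, size_t size,
                                        size_t nmemb, void* userdata)
{
    assert(userdata != nullptr);    // must be the std::string set via CURLOPT_WRITEDATA
    const size_t total = size * nmemb;
    auto* out = static_cast<std::string*>(userdata);
    try
    {
        out->append(contents, total);
    }
    catch (...)
    {
        // Exceptions must not escape into libCurl's C code. Returning a value
        // different from `total` tells libCurl to abort the transfer.
        return 0;
    }
    return total;
}
```

Because the chunks are appended, readBuffer has to be cleared before each new request, which is why the polling loop later resets it.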

Then, we are ready to execute our request. We call curl_easy_perform().

C++

res = curl_easy_perform(curl);

Right after that, we check the results.

Checking the Results

Next, we obtain the Task_ID which is given to each OCR task. You can initiate several tasks and then wait for each of them to be completed. We use the Task_ID we have obtained to check the status of the task and wait until it is completed.
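The polling loop in this section calls ObtainStatus(), ObtainURL() and ReplaceAll(), whose bodies are not listed here. The sketch below shows one way such helpers could work, assuming the service answers with a small XML document in which the task's id, status and resultUrl appear as attributes of a task element. ExtractAttribute() is a hypothetical name, the response shape is an assumption, and plain string search is used instead of a real XML parser.

```cpp
#include <cassert>
#include <string>

// Replace every occurrence of `from` in `text` with `to`.
std::string ReplaceAll(std::string text, const std::string& from,
                       const std::string& to)
{
    if (from.empty())
        return text;    // nothing to search for
    size_t pos = 0;
    while ((pos = text.find(from, pos)) != std::string::npos)
    {
        text.replace(pos, from.length(), to);
        pos += to.length();    // continue after the inserted text
    }
    return text;
}

// Return the value of attribute `name` at its first occurrence in `xml`,
// or an empty string if it is absent. Assumes the attribute is preceded by a
// single space and quoted with double quotes.
std::string ExtractAttribute(const std::string& xml, const std::string& name)
{
    assert(!name.empty());
    const std::string key = " " + name + "=\"";
    const size_t start = xml.find(key);
    if (start == std::string::npos)
        return "";
    const size_t value_begin = start + key.length();
    const size_t value_end = xml.find('"', value_begin);
    if (value_end == std::string::npos)
        return "";
    return xml.substr(value_begin, value_end - value_begin);
}
```

With these, the status would be ExtractAttribute(readBuffer, "status"), and the download URL would be ExtractAttribute(readBuffer, "resultUrl") passed through ReplaceAll() to turn each &amp;amp; entity back into &amp;.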

C++

while (1)
{
    res = curl_easy_perform(status_curl);
    if (res != CURLE_OK)
    {
        WriteLogFile(L"Error: %S", curl_easy_strerror(res));
    }
    else
    {
        WriteLogFile(L"Read Buffer:\n%S", readBuffer.c_str());
        task_status = ObtainStatus(readBuffer);
        WriteLogFile(L"task_status: %S", task_status.c_str());
    }
    if (task_status != "Completed")
    {
        //wait 2 seconds before the next check
        Sleep(2000);
    }
    else
    {
        setcolor(LOG_COLOR_DARKGREEN, 0);
        SetConsoleTitle(L"OCR completed");
        setcolor(LOG_COLOR_WHITE, 0);

        result_url = ObtainURL(readBuffer);
        //replace every &amp; with &
        result_url = ReplaceAll(result_url, "&amp;", "&");
        //downloading text file of response
        WriteLogFile(L"Downloading File from URL: %S", result_url.c_str());
        op_curl = curl_easy_init();
        if (op_curl)
        {
            headerlist = curl_slist_append(headerlist, buf);
            curl_easy_setopt(op_curl, CURLOPT_URL, result_url.c_str());
            curl_easy_setopt(op_curl, CURLOPT_HTTPHEADER, headerlist);
            curl_easy_setopt(op_curl, CURLOPT_HTTPGET, 1L);
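            FILE* wfd = fopen(json_result_file.c_str(), "ab");
            if (wfd)
            {
                fprintf(wfd, "\n");
                curl_easy_setopt(op_curl, CURLOPT_WRITEDATA, wfd);
                curl_easy_perform(op_curl);
                fclose(wfd);
                WriteLogFile(L"FILE saved");
            }
            else
            {
                //fopen failed, so there is nowhere to write the download
                WriteLogFile(L"Error: cannot open %S", json_result_file.c_str());
            }
            curl_easy_cleanup(op_curl);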
        }
        break;
    }
    readBuffer = "";
}    // While
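As written, the loop above keeps polling until the status is "Completed", so a task that ends in a failure state, or a connection that keeps failing, would be polled forever. The sketch below shows one possible way to bound the wait. WaitForTask() and IsFailureStatus() are hypothetical names, the failure status strings are assumed from ABBYY's documented task statuses, and fetch_status stands for whatever performs the request and parses the status.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <string>
#include <thread>

// Statuses after which further polling cannot succeed (assumed values).
bool IsFailureStatus(const std::string& status)
{
    return status == "ProcessingFailed" || status == "Deleted" ||
           status == "NotEnoughCredits";
}

// Calls fetch_status() until it returns "Completed", a failure status,
// or max_attempts calls have been made. Returns the last status seen.
std::string WaitForTask(const std::function<std::string()>& fetch_status,
                        int max_attempts,
                        std::chrono::milliseconds delay)
{
    assert(max_attempts > 0);
    std::string status;
    for (int attempt = 0; attempt < max_attempts; ++attempt)
    {
        status = fetch_status();
        if (status == "Completed" || IsFailureStatus(status))
            break;
        if (attempt + 1 < max_attempts)
            std::this_thread::sleep_for(delay);    // portable replacement for Sleep()
    }
    return status;
}
```

The caller then checks whether the returned status is "Completed" before downloading the result, and logs an error otherwise.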

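Once we have the results, we clean everything up: we free the MIME form with curl_mime_free(), free the header list with curl_slist_free_all(), release each remaining Curl handle with curl_easy_cleanup(), and finally call curl_global_cleanup().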

Building Blocks

One of the key building blocks of such a project is libCurl. I used it as a static library. The .lib file is included in the article's source code; however, you can read about using libCurl as a static library here.

Note: WriteLogFile() is one of my old logging functions described in this article.

Using the Code

You can use a different export format. See this link for the options.

You can define which languages you are expecting. Read this link for the options.

You can use many languages, and most of them can also be recognized as handwritten text (ICR). You set the list of expected languages in the PROCESSING_URL string.

In this example, we expect English and Hebrew:

C++

#define PROCESSING_URL    "https://cloud-westus.ocrsdk.com/processImage?exportFormat=txt&language=English,Hebrew"
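If the export format or the languages need to change at run time, the same URL can be assembled from its parts instead of being hard-coded. The following is a minimal sketch; BuildProcessingUrl() is a hypothetical helper, and it assumes the format and language names are plain identifiers that need no URL-encoding.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Builds a processImage URL with a comma-separated list of languages.
std::string BuildProcessingUrl(const std::string& host,
                               const std::string& export_format,
                               const std::vector<std::string>& languages)
{
    assert(!languages.empty());    // at least one expected language
    std::string url = host + "/processImage?exportFormat=" + export_format +
                      "&language=";
    for (size_t i = 0; i < languages.size(); ++i)
    {
        if (i > 0)
            url += ',';
        url += languages[i];
    }
    return url;
}
```

The resulting string would then be passed to curl_easy_setopt(curl, CURLOPT_URL, url.c_str()) in place of the PROCESSING_URL macro.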

History

30th January, 2021: Initial version

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)