Zero Footprint OCR with LEADTOOLS HTML5

7 Jan 2013
This article is in the Product Showcase section for our sponsors at CodeProject. These articles are intended to provide you with information on products and services that we consider useful and of value to developers.

Introduction

Performing Optical Character Recognition (OCR) on mobile devices has always been a challenge due to the minimal processing power and storage space.  LEADTOOLS is changing the game with its HTML5 / JavaScript image viewer control and RESTful Web Services.

LEADTOOLS is a long-time provider of award-winning OCR toolkits.  With its new HTML5 viewer and RESTful Web Services, you can build a single recognition application that runs on any desktop, tablet or mobile device.  Because the recognition work is done by the web service, developers no longer have to sacrifice speed and features to make the application work on a mobile device.  And because the application is inherently cross-platform, programmers spend far less time testing for platform-specific quirks and can update and maintain it with greater peace of mind.

Key HTML5 & OCR Features in LEADTOOLS SDKs

  • HTML5 / JavaScript Viewer Control for cross-platform image viewing
  • Works on any desktop, tablet or mobile device browser that supports HTML5
  • Supports both mouse and multi-touch (gesture) input
  • Interactive Modes Include:
    • Pan
    • Scale
    • Zoom to Rectangle
    • Center at Point
    • Magnifying Glass
    • Pinch and Zoom
    • Rubber Band
  • Display images based on physical and logical units
  • Built-in image manipulation for:
    • Rotate
    • Flip
    • Resize/Scale
  • Native HTML5 Image Annotation and Markup
  • Source code included for easy customization and branding
  • Extend with LEADTOOLS RESTful Web Services to add advanced features such as extended file format support (e.g. TIFF, PDF, DOC, DICOM, etc.), OCR and Barcode
  • Fast, accurate and multi-threaded OCR engines for use in both desktop applications and high-performance server environments
  • Full page and zonal OCR
  • Broad language and character set support including Latin, Cyrillic, East Asian and Arabic
  • Powerful document image cleanup and preprocessing functions
  • Extract text from any color, grayscale or black and white image
  • Easily create flexible, powerful and efficient distributed OCR applications using the LEADTOOLS Cloud SDK

HTML5 / JavaScript Viewer Control

This example builds on the example from our first HTML5 viewer article and shows how to call the OCR RESTful web service. All the code for the viewer is included in the attached example and is fully functional, but if you would like a more detailed explanation of those components, check out our first article.

OCR RESTful Web Service 

The LEADTOOLS OCR RESTful Web Service is a simple way to add OCR functionality to any application without the need for downloading large language recognition libraries and executables.  It takes a simple set of parameters (source image and recognition area) and returns an easily parsed JSON structure with the resulting text.

In this demo, we show how to use the web service to perform two types of recognition: zonal and full page. Zonal recognition is accomplished by selecting a small region, or zone, with the viewer’s built-in rubber band interactive mode. The user selects a rectangle on the image with a mouse click and drag or a finger swipe on a touch screen; the application then handles the rubber band completed event and passes the rectangle's coordinates to the web service.
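The zone handling described above can be sketched without the viewer. The following is a minimal illustration, not LEADTOOLS code: `normalizeZone` is a hypothetical helper, and it assumes the two drag points have already been converted to image pixel coordinates (the demo does that conversion with the viewer's `convertRect`). It orders the coordinates so the drag can run in any direction, clips the rectangle to the image and rejects selections that are too small, using the same 3-pixel threshold as the demo.

```javascript
// Hypothetical helper, not part of the LEADTOOLS API.
// p1 and p2 are the start and end points of the drag, in image pixels.
function normalizeZone(p1, p2, imageWidth, imageHeight) {
   // The drag may run in any direction, so order the coordinates first.
   var left = Math.min(p1.x, p2.x);
   var top = Math.min(p1.y, p2.y);
   var right = Math.max(p1.x, p2.x);
   var bottom = Math.max(p1.y, p2.y);

   // Round outward to whole pixels and clip to the image bounds.
   left = Math.max(0, Math.floor(left));
   top = Math.max(0, Math.floor(top));
   right = Math.min(imageWidth, Math.ceil(right));
   bottom = Math.min(imageHeight, Math.ceil(bottom));

   // Reject zones that are too small to hold text (the demo uses 3 pixels).
   var MIN_SIZE = 3;
   if (right - left <= MIN_SIZE || bottom - top <= MIN_SIZE) {
      return null;
   }
   return { left: left, top: top, right: right, bottom: bottom };
}

// A right-to-left drag that starts inside the image and ends above it.
var zone = normalizeZone({ x: 220.6, y: 90.2 }, { x: 40.3, y: -12 }, 800, 600);
console.log(zone); // { left: 40, top: 0, right: 221, bottom: 91 }
```

Returning `null` for an unusable selection lets the caller skip the service call entirely, which avoids a round trip that could not return text.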

Once the service has finished processing, as indicated by the XMLHttpRequest onreadystatechange event reaching readyState 4, you can parse the JSON response and display or use the recognized text as your application requires.  For this example we simply display the text in an alert box.

_selectRecognizeArea_RubberBandCompleted$1: function HTML5DemosLibrary__ocrDemo$_selectRecognizeArea_RubberBandCompleted$1(sender, e) {
   // Get the selected area and use that as a zone for the OCR service
   var searchArea = Leadtools.LeadRectD.fromLTRB(e.get_point1().get_x(), e.get_point1().get_y(), e.get_point2().get_x(), e.get_point2().get_y());
   var visibleRect = _viewer.imageControlRectangle(true);
   searchArea.intersect(visibleRect);
   searchArea = _viewer.convertRect(Leadtools.Controls.CoordinateType.control, Leadtools.Controls.CoordinateType.image, searchArea);
   if (searchArea.get_width() > 3 && searchArea.get_height() > 3) {
      this._recognize$1(searchArea);
   }
},
 
_recognize$1: function HTML5DemosLibrary__ocrDemo$_recognize$1(searchArea) {
   // display the loading gif while we wait
   this.beginOperation();
   
   // build the service request
   var rest = this.buildServiceUrl('ocr.svc');
   rest += '/GetText?uri=';
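   // percent-encode the image URL so characters such as '&' or spaces do not break the query string
   rest += encodeURIComponent(_viewer.get_imageUrl());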
   var imageSize = _viewer.get_imageSize();
   rest += '&width=';
   rest += parseInt(imageSize.get_width());
   rest += '&height=';
   rest += parseInt(imageSize.get_height());
   if (!searchArea.get_isEmpty()) {
      // a zone was selected, so restrict recognition to it;
      // when the zone is empty these parameters are omitted and the whole image is recognized
      rest += '&left=';
      rest += parseInt(searchArea.get_left());
      rest += '&top=';
      rest += parseInt(searchArea.get_top());
      rest += '&right=';
      rest += parseInt(searchArea.get_right());
      rest += '&bottom=';
      rest += parseInt(searchArea.get_bottom());
   }
   
   // create the request and event handler
   var request = new XMLHttpRequest();
   var _this = this;              
   var readyStateChanged = function() {
      if (request.readyState === 4) {
         if (request.status === 200) {
            var results = null;
            if (request.responseText != null && request.responseText.length > 0) {
               results = JSON.parse(request.responseText);
            }
            else {
               alert('No text was found in the specified area, please select another area that contains text and try again.');
            }
            request.onreadystatechange = null;
            request = null;
            _this.endOperation(false);
            if (results != null) {
               alert (results);
            }
         }
         else {
            _this.showRequestError(request);
         }
      }
   };
   
   // send the request
   request.onreadystatechange = readyStateChanged;
   request.open('GET', rest, true);
   request.send();
},
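The response handling in the listing above can be isolated into a small function, which makes it easier to test away from the browser. This is only a sketch: `parseOcrResponse` is a hypothetical name, and the JSON string used in the example is a stand-in, since the exact shape of the service's response is not shown in this article. Unlike the listing, it also treats malformed JSON (for example an HTML error page) as "no result" instead of letting `JSON.parse` throw.

```javascript
// Hypothetical helper: returns the parsed response, or null when the
// service returned nothing usable.
function parseOcrResponse(responseText) {
   if (typeof responseText !== 'string' || responseText.trim().length === 0) {
      return null;
   }
   try {
      return JSON.parse(responseText);
   } catch (err) {
      // Malformed JSON is treated the same as an empty response.
      return null;
   }
}

// Stand-in payloads; the real service response may be structured differently.
console.log(parseOcrResponse('"Hello, OCR"'));       // Hello, OCR
console.log(parseOcrResponse(''));                   // null
console.log(parseOcrResponse('<html>error</html>')); // null
```

In the demo's handler, a `null` result would map to the "No text was found" message, and any other value to the alert that shows the recognized text.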

You may have noticed that if an empty rectangle is passed to the recognize function, the zone parameters are left out of the request and the web service recognizes the entire image. Therefore, all the programmer needs to do to accomplish full page OCR is create a very simple button handler.

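// assumes this code runs inside a method of the demo object,
// so that `this` is the object that defines _recognize$1
var _this = this;
var recognizeButton = document.getElementById('recognizeButton');
recognizeButton.addEventListener('click', function(e) {
   // recognize the entire image by sending an empty zone
   _this._recognize$1(Leadtools.LeadRectD.get_empty());
}, false);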
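To make the difference between the zonal and full page requests concrete, the query string can be built by a standalone function. This is a sketch under stated assumptions: `buildGetTextUrl` is a hypothetical helper, the service address is a placeholder, and only the parameter names (`uri`, `width`, `height`, `left`, `top`, `right`, `bottom`) are taken from the demo code above. The image URL is percent-encoded so that it survives as a single query parameter.

```javascript
// Hypothetical helper mirroring the request built in the demo's recognize function.
// zone is either null (full page) or { left, top, right, bottom } in image pixels.
function buildGetTextUrl(serviceUrl, imageUrl, imageSize, zone) {
   var parts = [
      'uri=' + encodeURIComponent(imageUrl),
      'width=' + Math.floor(imageSize.width),
      'height=' + Math.floor(imageSize.height)
   ];
   if (zone) {
      // Zone parameters are only sent for zonal recognition.
      parts.push('left=' + Math.floor(zone.left));
      parts.push('top=' + Math.floor(zone.top));
      parts.push('right=' + Math.floor(zone.right));
      parts.push('bottom=' + Math.floor(zone.bottom));
   }
   return serviceUrl + '/GetText?' + parts.join('&');
}

// Placeholder addresses, for illustration only.
var service = 'http://localhost/ocr-service/ocr.svc';
var image = 'http://localhost/images/page 1.png';
var size = { width: 800, height: 600 };

console.log(buildGetTextUrl(service, image, size, null));
console.log(buildGetTextUrl(service, image, size, { left: 40, top: 0, right: 221, bottom: 91 }));
```

The first call produces a URL with only `uri`, `width` and `height`; the second appends the four zone parameters. Keeping the builder free of viewer and DOM calls means it can be checked in any JavaScript runtime.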

Conclusion

LEADTOOLS provides developers with access to the world’s best performing and most stable imaging libraries in an easy-to-use, high-level programming interface enabling rapid development of business-critical applications.

HTML5 and OCR RESTful Web Services are only some of the many technologies LEADTOOLS has to offer.  For more information on our other products, be sure to visit our home page, download a free fully functioning evaluation SDK, and take advantage of our free technical support during your evaluation.

Download the Full HTML5 Example

You can download a fully functional demo which includes the features discussed above.  To run this example, you will need to do the following:

  • Install LEADTOOLS V17.5 (free 60-day evaluation)
  • Run the configuration utility (C:\LEADTOOLS 17.5\Shortcuts\HTML5\01 Document\01 Local Demos\07 Run This First To Configure Demos and Services) to create the virtual directories for the HTML5 demos and RESTful Web Services
  • Copy LEADTOOLSHTML5RESTDemo.htm to C:\LEADTOOLS 17.5\Examples\JS\HTML5Demos (this is where http://localhost/LEADTOOLSHTML5Demos points)
  • If loading images hosted in your IIS, make sure you add MIME types for each format you wish to support (e.g. .j2k = image/jpeg-2000, .dcm = image/dicom, etc.)

LEADTOOLS also ships with a more fully featured HTML5 / JavaScript OCR demo with additional tools such as image processing, interactive modes and more detailed event handling.

Support

Need help getting this sample up and running?  Contact our support team for free technical support!  For pricing or licensing questions, you can contact our sales team (sales@leadtools.com) or call us at 704-332-5532.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.
