Introduction
This application uses the Tesseract 3 OCR engine, which works by recognizing character patterns (https://github.com/tesseract-ocr/tesseract). Tesseract has Unicode (UTF-8) support and can recognize more than 100 languages "out of the box".
Background
I first tried the Google Text Recognition API (https://developers.google.com/vision/android/text-overview), but it was not suitable for my needs, so I turned to this excellent engine instead.
Using the Code
Let's start! Create a new project in Android Studio (I used version 3.2.1), or download the source files and choose File > New > Import Project.
Add the following dependencies to the app-level build.gradle:
implementation 'com.jakewharton:butterknife:8.8.1'
annotationProcessor 'com.jakewharton:butterknife-compiler:8.8.1'
implementation 'com.rmtheis:tess-two:9.0.0'
I use the ButterKnife library because it makes view binding very convenient. The main dependency is 'tess-two:9.0.0', which contains a fork of Tesseract Tools for Android (tesseract-android-tools) with some additional functions. We also need camera and external storage write permissions, so add them to AndroidManifest.xml:
<uses-permission android:name="android.permission.WRITE_EXTERNAL_STORAGE" />
<uses-feature android:name="android.hardware.camera" />
<uses-permission android:name="android.permission.CAMERA" />
Make a simple layout file with a Button, a TextView and an ImageView:
="1.0"="utf-8"
<ScrollView xmlns:android="http://schemas.android.com/apk/res/android"
xmlns:tools="http://schemas.android.com/tools"
android:layout_width="match_parent"
android:layout_height="wrap_content"
android:fillViewport="true"
tools:context=".MainActivity">
<LinearLayout
android:layout_width="match_parent"
android:layout_height="wrap_content"
android:orientation="vertical">
<LinearLayout
android:layout_width="match_parent"
android:layout_height="wrap_content"
android:orientation="vertical">
<Button
android:id="@+id/scan_button"
android:layout_width="wrap_content"
android:layout_height="wrap_content"
android:layout_gravity="center"
android:text="scan" />
</LinearLayout>
<LinearLayout
android:layout_width="match_parent"
android:layout_height="wrap_content"
android:layout_margin="4dp"
android:orientation="horizontal">
<TextView
android:id="@+id/ocr_text"
android:layout_width="match_parent"
android:layout_height="wrap_content"
android:layout_gravity="fill"
android:text=" text">
</TextView>
</LinearLayout>
<LinearLayout
android:layout_width="match_parent"
android:layout_height="wrap_content"
android:orientation="vertical">
<ImageView
android:id="@+id/ocr_image"
android:layout_width="match_parent"
android:layout_height="wrap_content" />
</LinearLayout>
</LinearLayout>
</ScrollView>
We get something like this:
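The snippets below reference several fields of MainActivity (bound views, flags, URIs and request codes) that are scattered through the original source. Here is a rough sketch of how they can be declared and bound with ButterKnife; the constant values, error strings and layout name are placeholders of my own, not taken from the original project:
public class MainActivity extends AppCompatActivity {

    // Views from the layout above, bound with ButterKnife.
    @BindView(R.id.ocr_text) TextView ocrText;
    @BindView(R.id.ocr_image) ImageView firstImage;

    // State shared between the snippets below.
    Context context;
    TesseractOCR mTessOCR;          // created later, see the TesseractOCR class
    ProgressDialog mProgressDialog;
    boolean flagPermissions;
    String mCurrentPhotoPath;
    Uri photoURI1, oldPhotoURI;

    // Request codes and user-facing messages (the values here are placeholders).
    static final int PERMISSION_ALL = 1;
    static final int REQUEST_IMAGE1_CAPTURE = 1;
    final String errorFileCreate = "Error: file could not be created";
    final String errorConvert = "Error: image could not be converted";

    String[] PERMISSIONS = {
            Manifest.permission.WRITE_EXTERNAL_STORAGE,
            Manifest.permission.CAMERA
    };

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main);
        ButterKnife.bind(this);
        context = this;
        checkPermissions();
    }
}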
Write some code to check permissions:
void checkPermissions() {
    // Request any missing runtime permissions; only mark the flag as true
    // once every permission in PERMISSIONS has been granted.
    if (!hasPermissions(context, PERMISSIONS)) {
        requestPermissions(PERMISSIONS, PERMISSION_ALL);
        flagPermissions = false;
        return;
    }
    flagPermissions = true;
}
public static boolean hasPermissions(Context context, String... permissions) {
if (context != null && permissions != null) {
for (String permission : permissions) {
if (ActivityCompat.checkSelfPermission(context, permission)
!= PackageManager.PERMISSION_GRANTED) {
return false;
}
}
}
return true;
}
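The result of the permission request is not shown above; one simple way to handle it (a sketch, not part of the original snippet) is to update flagPermissions in onRequestPermissionsResult:
@Override
public void onRequestPermissionsResult(int requestCode, @NonNull String[] permissions,
                                        @NonNull int[] grantResults) {
    super.onRequestPermissionsResult(requestCode, permissions, grantResults);
    if (requestCode == PERMISSION_ALL) {
        // Re-check so the flag only becomes true when every permission was granted.
        flagPermissions = hasPermissions(context, PERMISSIONS);
    }
}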
And code to create a file:
public File createImageFile() throws IOException {
    // Build a unique, timestamped file name in the app's external pictures directory.
    String timeStamp = new SimpleDateFormat("MMdd_HHmmss", Locale.US).format(new Date());
    String imageFileName = "JPEG_" + timeStamp + "_";
    File storageDir = context.getExternalFilesDir(Environment.DIRECTORY_PICTURES);
    File image = File.createTempFile(
            imageFileName,  /* prefix */
            ".jpg",         /* suffix */
            storageDir      /* directory */
    );
    // Remember the path so the captured photo can be found later.
    mCurrentPhotoPath = image.getAbsolutePath();
    return image;
}
First, we need to write the onClickScanButton function, which checks permissions and launches the camera:
@OnClick(R.id.scan_button)
void onClickScanButton() {
if (!flagPermissions) {
checkPermissions();
return;
}
Intent takePictureIntent = new Intent(MediaStore.ACTION_IMAGE_CAPTURE);
if (takePictureIntent.resolveActivity(context.getPackageManager()) != null) {
File photoFile = null;
try {
photoFile = createImageFile();
} catch (IOException ex) {
Toast.makeText(context, errorFileCreate, Toast.LENGTH_SHORT).show();
Log.i("File error", ex.toString());
}
if (photoFile != null) {
oldPhotoURI = photoURI1;
photoURI1 = Uri.fromFile(photoFile);
takePictureIntent.putExtra(MediaStore.EXTRA_OUTPUT, photoURI1);
startActivityForResult(takePictureIntent, REQUEST_IMAGE1_CAPTURE);
}
}
}
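Note that on Android 7.0 (API 24) and above, handing the camera app a file:// URI created with Uri.fromFile() can throw a FileUriExposedException. If you run into this, one possible workaround (a sketch only; the authority string is a placeholder and needs a matching <provider> entry in AndroidManifest.xml) is to pass a FileProvider URI in the intent while keeping the file path for our own reads and writes:
// Instead of putting the file:// URI into the intent, obtain a content:// URI
// from a FileProvider declared in AndroidManifest.xml (authority is a placeholder).
Uri contentUri = FileProvider.getUriForFile(context,
        context.getPackageName() + ".fileprovider", photoFile);
takePictureIntent.putExtra(MediaStore.EXTRA_OUTPUT, contentUri);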
We handle the result in onActivityResult, where the captured image is decoded, shown in the ImageView, passed to OCR, and written back to the file:
@Override
protected void onActivityResult(int requestCode, int resultCode, @Nullable Intent data) {
super.onActivityResult(requestCode, resultCode, data);
switch (requestCode) {
case REQUEST_IMAGE1_CAPTURE: {
if (resultCode == RESULT_OK) {
Bitmap bmp = null;
try {
InputStream is = context.getContentResolver().openInputStream(photoURI1);
BitmapFactory.Options options = new BitmapFactory.Options();
bmp = BitmapFactory.decodeStream(is, null, options);
} catch (Exception ex) {
Log.i(getClass().getSimpleName(), ex.getMessage());
Toast.makeText(context, errorConvert, Toast.LENGTH_SHORT).show();
}
firstImage.setImageBitmap(bmp);
doOCR(bmp);
OutputStream os;
try {
os = new FileOutputStream(photoURI1.getPath());
if (bmp != null) {
bmp.compress(Bitmap.CompressFormat.JPEG, 100, os);
}
os.flush();
os.close();
} catch (Exception ex) {
Log.e(getClass().getSimpleName(), ex.getMessage());
Toast.makeText(context, errorFileCreate, Toast.LENGTH_SHORT).show();
}
} else {
    // The capture was cancelled, so fall back to the previous photo.
    photoURI1 = oldPhotoURI;
    firstImage.setImageURI(photoURI1);
}
}
}
}
Next, integrate Tesseract into the project by adding a helper class, TesseractOCR. I put the trained data file "eng.traineddata" for the English language into the assets folder, so we need to copy it from the APK into the app's internal files directory and then initialize Tesseract with mTess.init(dstInitPathDir, language).
public class TesseractOCR {
    private static final String TAG = TesseractOCR.class.getSimpleName();
    private final TessBaseAPI mTess;
public TesseractOCR(Context context, String language) {
mTess = new TessBaseAPI();
boolean fileExistFlag = false;
AssetManager assetManager = context.getAssets();
String dstPathDir = "/tesseract/tessdata/";
String srcFile = "eng.traineddata";
InputStream inFile = null;
dstPathDir = context.getFilesDir() + dstPathDir;
String dstInitPathDir = context.getFilesDir() + "/tesseract";
String dstPathFile = dstPathDir + srcFile;
FileOutputStream outFile = null;
try {
inFile = assetManager.open(srcFile);
File f = new File(dstPathDir);
if (!f.exists()) {
if (!f.mkdirs()) {
Toast.makeText(context, srcFile + " can't be created.", Toast.LENGTH_SHORT).show();
}
outFile = new FileOutputStream(new File(dstPathFile));
} else {
fileExistFlag = true;
}
} catch (Exception ex) {
Log.e(TAG, ex.getMessage());
} finally {
if (fileExistFlag) {
try {
if (inFile != null) inFile.close();
mTess.init(dstInitPathDir, language);
return;
} catch (Exception ex) {
Log.e(TAG, ex.getMessage());
}
}
if (inFile != null && outFile != null) {
try {
byte[] buf = new byte[1024];
int len;
while ((len = inFile.read(buf)) != -1) {
outFile.write(buf, 0, len);
}
inFile.close();
outFile.close();
mTess.init(dstInitPathDir, language);
} catch (Exception ex) {
Log.e(TAG, ex.getMessage());
}
} else {
Toast.makeText(context, srcFile + " can't be read.", Toast.LENGTH_SHORT).show();
}
}
}
public String getOCRResult(Bitmap bitmap) {
mTess.setImage(bitmap);
return mTess.getUTF8Text();
}
public void onDestroy() {
if (mTess != null) mTess.end();
}
}
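The class can then be created once, for example in onCreate (the placement here is just a suggestion), with the language code matching the traineddata file:
// "eng" must match the eng.traineddata file that was copied from assets.
mTessOCR = new TesseractOCR(this, "eng");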
The OCR call itself is simple: we pass a Bitmap to this object and get the recognized text back:
public String getOCRResult(Bitmap bitmap) {
mTess.setImage(bitmap);
    return mTess.getUTF8Text();
}
OCR can take a long time, so we run it on a separate Thread and show a progress dialog while it works:
private void doOCR(final Bitmap bitmap) {
if (mProgressDialog == null) {
mProgressDialog = ProgressDialog.show(this, "Processing",
"Doing OCR...", true);
} else {
mProgressDialog.show();
}
new Thread(new Runnable() {
public void run() {
final String srcText = mTessOCR.getOCRResult(bitmap);
runOnUiThread(new Runnable() {
@Override
public void run() {
if (srcText != null && !srcText.equals("")) {
ocrText.setText(srcText);
}
mProgressDialog.dismiss();
}
});
}
}).start();
}
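When the Activity is destroyed, the native Tesseract resources should be released; a minimal sketch of the cleanup (not shown in the original code) could look like this:
@Override
protected void onDestroy() {
    super.onDestroy();
    // Dismiss the dialog if it is still showing and free the native Tesseract memory.
    if (mProgressDialog != null && mProgressDialog.isShowing()) {
        mProgressDialog.dismiss();
    }
    if (mTessOCR != null) {
        mTessOCR.onDestroy();
    }
}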
The source image is as follows:
Result of OCR is given below:
Points of Interest
If you are interested in using the Tesseract OCR engine, I hope this simple article helps you get started, and you can easily improve this application further. I like to develop applications, so you can try some of mine at https://play.google.com/store/apps/developer?id=VOLOSHYN+SERGIY.