C# OCR (How to Read a single character from image)

Question

Hello,
I want to build an OCR project in C# that recognizes a single character from an image (not a sentence from a document). Can anyone help me out with some code? I searched the internet and found two or three OCR code samples, but they were too difficult to understand. Can anyone give me some simple code or a free library that does this?

Thanks in Advance

Posted 14-Feb-11 7:13am

cYpH3r x3r0

3 solutions

Solution 3

I am using MODI (Office OCR) to do this. It works like a charm for words, sentences, numbers, and mixes of both. However, it fails if the image contains only a single character.
I have no idea why it fails, but since this is what you want to do, you should probably go for one of the other OCR solutions.

Is there anybody who knows why it fails with only one letter?

Here is my code:

C#

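// Requires: using System.Drawing; using System.Drawing.Imaging; using System.Runtime.InteropServices;
// Load image from file
Bitmap BWImage = new Bitmap(fileName);
// Lock the bitmap in memory as 1 bpp; ReadOnly because the pixels are only copied out
System.Drawing.Imaging.BitmapData BWLockImage = BWImage.LockBits(new Rectangle(0, 0, BWImage.Width, BWImage.Height), System.Drawing.Imaging.ImageLockMode.ReadOnly, PixelFormat.Format1bppIndexed);

// Copy image data to a byte array
int imageSize = BWLockImage.Stride * BWLockImage.Height;
byte[] BWImageBuffer = new byte[imageSize];
Marshal.Copy(BWLockImage.Scan0, BWImageBuffer, 0, imageSize);
// tmpPosRect is the region to recognize (its definition is not shown here)
string text = DoOCR(BWLockImage, BWImageBuffer, tmpPosRect, false);
// Release the lock once OCR no longer needs the BitmapData
BWImage.UnlockBits(BWLockImage);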



// Do the OCR with this function
// Requires: using System.IO; plus a COM reference to the MODI type library
public string DoOCR(System.Drawing.Imaging.BitmapData BWLockImage, byte[] BWImageBuffer, Rectangle iAusschnitt, bool isNumber)
{
    // Crop the requested region (iAusschnitt) out of the 1 bpp buffer
    Bitmap tmpImage = Bildausschnitt1bpp(BWLockImage, BWImageBuffer, iAusschnitt);
    string file = Path.GetTempFileName();
    string tmpResult = "";
    try
    {
        tmpImage.Save(file, ImageFormat.Tiff);
        _MODIDocument.Create(file);
        // Set the MODI parameters and run OCR
        _MODIDocument.OCR(MODI.MiLANGUAGES.miLANG_ENGLISH, false, false);

        MODI.IImage myImage = (MODI.IImage)_MODIDocument.Images[0]; // first page in file
        MODI.ILayout myLayout = (MODI.ILayout)myImage.Layout;
        tmpResult = myLayout.Text;
    }
    catch
    {
        // OCR failed: fall through and return the empty string
    }
    finally
    {
        // Clean up on success as well as on failure, so the temp file is never left behind
        if (_MODIDocument != null)
        {
            _MODIDocument.Close(false); // closes the document without saving and frees its memory
        }
        // Release the image
        tmpImage.Dispose();
        // Run the garbage collector
        GC.Collect();
        // Delete the temporary image file
        File.Delete(file);
    }
    return tmpResult;
}
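
The helper Bildausschnitt1bpp (roughly "image section, 1 bpp") is called above but not shown in the post. As a rough idea of what cropping packed 1 bpp data involves, here is a hypothetical sketch. It is not the poster's implementation: it works on plain byte arrays instead of BitmapData, and the name OneBppCrop.Crop and all of its parameters are assumptions made for illustration. It also assumes a positive stride and the most-significant-bit-first pixel order of Format1bppIndexed.

```csharp
using System;

static class OneBppCrop
{
    // Copies a width x height region starting at (x, y) out of a packed 1 bpp buffer.
    // outStride receives the row length of the result in bytes.
    public static byte[] Crop(byte[] source, int sourceStride,
                              int x, int y, int width, int height,
                              out int outStride)
    {
        // Pad each output row to a multiple of 4 bytes, like GDI+ bitmap rows.
        outStride = ((width + 31) / 32) * 4;
        byte[] result = new byte[outStride * height];

        for (int row = 0; row < height; row++)
        {
            for (int col = 0; col < width; col++)
            {
                int sx = x + col;
                // Each byte holds 8 pixels; the leftmost pixel is the highest bit.
                int srcIndex = (y + row) * sourceStride + (sx >> 3);
                bool isSet = (source[srcIndex] & (0x80 >> (sx & 7))) != 0;
                if (isSet)
                {
                    result[row * outStride + (col >> 3)] |= (byte)(0x80 >> (col & 7));
                }
            }
        }
        return result;
    }
}
```

The returned buffer could then be copied into a new Format1bppIndexed bitmap with Marshal.Copy, mirroring the copy-out step in the code above.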

Posted 5-Sep-11 5:15am

Schreibdoch

Comments

amolnarkhede 21-Jun-12 1:16am

I am getting an "OCR running error"... please help.


This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

**Orcun Iyigun** · Accepted Answer · 2011-02-14T07:55:00

Solution 1

As you said, you need OCR (Optical Character Recognition). C# has no built-in OCR function, but the Microsoft Office Document Imaging (MODI) library might be helpful. Check out these links:

How To: Use Office 2007 OCR Using C#

http://www.devsource.com/c/a/Languages/Using-The-Office-2007-OCR-Component-in-C/

OCR with Microsoft® Office

Free libraries that you can use:
- Tesseract
http://code.google.com/p/tesseract-ocr/

- GOCR

Posted 14-Feb-11 7:55am

Orcun Iyigun

Comments

Espen Harlinn 14-Feb-11 15:53pm

Good links, my 5

Sergey Alexandrovich Kryukov 14-Feb-11 17:32pm

Note that the OP requires C#. I know some good code that I have actually tried; please see my answer.
--SA

cYpH3r x3r0 15-Feb-11 1:29am

I am still pretty confused, because I want to use MODI and I don't know the proper procedure to follow. The link you gave me on MODI is no doubt very good, and I have downloaded the code, but it gives me this error:

The type or namespace name 'MODI' could not be found (are you missing a using directive or an assembly reference?)

I know I have to add some reference to make it work, but I don't know which reference.
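
The compiler error quoted above typically means the project has no reference to the MODI COM type library. In Visual Studio this is usually added through Add Reference, COM tab, entry "Microsoft Office Document Imaging 12.0 Type Library" for Office 2007 (11.0 for Office 2003); the MODI component may also need to be enabled in the Office setup first. Below is a minimal sketch of what the code might look like once the reference is in place. It uses only the MODI calls that appear in Solution 3; the class name and the command-line handling are illustrative assumptions.

```csharp
using System;

class ModiSketch
{
    // Assumes a COM reference to the MODI type library has been added to the project.
    static void Main(string[] args)
    {
        // args[0]: path to a TIFF image (placeholder supplied by the caller)
        MODI.Document doc = new MODI.Document();
        try
        {
            doc.Create(args[0]);
            // English, no auto-orientation, no straightening (same arguments as Solution 3)
            doc.OCR(MODI.MiLANGUAGES.miLANG_ENGLISH, false, false);

            MODI.IImage image = (MODI.IImage)doc.Images[0]; // first page in the file
            MODI.ILayout layout = (MODI.ILayout)image.Layout;
            Console.WriteLine(layout.Text);
        }
        finally
        {
            doc.Close(false); // close without saving changes
        }
    }
}
```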

Sergey Alexandrovich Kryukov · Accepted Answer · 2011-02-14T11:31:00

Tesseract and GOCR are not easy to use and not that good; Office is proprietary and not always available.

There is a C# binding for Tesseract called Tessnet: http://www.pixel-technology.com/freeware/tessnet2/
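
For orientation, here is a minimal sketch of calling Tessnet from C#, following the usage pattern I recall from the tessnet2 page; please check it against the version you download. It assumes the project references the tessnet2 assembly and that the English tessdata files are available locally. The class name and both command-line arguments are placeholders.

```csharp
using System;
using System.Collections.Generic;
using System.Drawing;

class TessnetSketch
{
    static void Main(string[] args)
    {
        // args[0]: image file; args[1]: folder containing the English tessdata files
        using (Bitmap image = new Bitmap(args[0]))
        {
            tessnet2.Tesseract ocr = new tessnet2.Tesseract();
            ocr.Init(args[1], "eng", false); // false = not numeric-only mode

            // Rectangle.Empty asks for the whole image to be processed
            List<tessnet2.Word> words = ocr.DoOCR(image, Rectangle.Empty);
            foreach (tessnet2.Word word in words)
            {
                Console.WriteLine("{0} : {1}", word.Confidence, word.Text);
            }
        }
    }
}
```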

There is some good work on CodeProject. I think these projects are better, but they need to be completed; the basic and most difficult work is done. As you mentioned "single character", one of them should be ideal for you (I tried them out). See: Neural Network OCR (this one is one of the best), Creating Optical Character Recognition (OCR) applications using Neural Networks, OCR Line Detection, Unicode Optical Character Recognition (this one is one of the best).
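
To give a feel for single-character recognition without any library, here is a small sketch of nearest-template matching. To be clear, this is a simpler stand-in and not the neural-network approach of the articles listed above: it scales a binary glyph down to a small grid and picks the stored template with the fewest differing cells. All names (TemplateMatcher, Downsample, Distance, Classify) are made up for this example, and the glyph is assumed to be already binarized and cropped to its bounding box.

```csharp
using System;
using System.Collections.Generic;

static class TemplateMatcher
{
    // Scales a binary glyph (true = ink) to a size x size grid by nearest-neighbour sampling.
    public static bool[] Downsample(bool[,] glyph, int size)
    {
        int rows = glyph.GetLength(0);
        int cols = glyph.GetLength(1);
        bool[] grid = new bool[size * size];
        for (int y = 0; y < size; y++)
        {
            for (int x = 0; x < size; x++)
            {
                grid[y * size + x] = glyph[y * rows / size, x * cols / size];
            }
        }
        return grid;
    }

    // Hamming distance: the number of grid cells in which the two grids differ.
    public static int Distance(bool[] a, bool[] b)
    {
        int differing = 0;
        for (int i = 0; i < a.Length; i++)
        {
            if (a[i] != b[i]) differing++;
        }
        return differing;
    }

    // Returns the character whose template is closest to the glyph ('?' if there are no templates).
    public static char Classify(bool[,] glyph, IDictionary<char, bool[]> templates, int size)
    {
        bool[] grid = Downsample(glyph, size);
        char best = '?';
        int bestDistance = int.MaxValue;
        foreach (KeyValuePair<char, bool[]> entry in templates)
        {
            int d = Distance(grid, entry.Value);
            if (d < bestDistance)
            {
                bestDistance = d;
                best = entry.Key;
            }
        }
        return best;
    }
}
```

Templates would be built once by calling Downsample on a clean sample of each character with the same grid size, then passed to Classify together with the unknown glyph. A trained classifier like the ones in the linked articles should cope much better with font and noise variation than this direct comparison.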

Good luck,

—SA